5' TO 3' EXONUCLEASE MUTATIONS OF
THERMOSTABLE DNA POLYMERASES

Cross-Reference to Related Applications

This is a continuation-in-part (CIP) of copending
Serial Nos. 590,213, 590,466 and 590,490, all of which
were filed on September 28, 1990, and all of which are
CIPs of Serial No. 523,394, filed May 15, 1990, which
is a CIP of abandoned Serial No. 143,441, filed January
12, 1988, which is a CIP of Serial No. 063,509, filed
June 17, 1987, which issued as United States Patent No.
4,889,818 and which is a CIP of abandoned Serial No.
899,241, filed August 22, 1986.

This is also a CIP of Serial No. 746,121, filed
August 15, 1991, which is a CIP of: 1) PCT/US90/07641,
filed December 21, 1990, which is a CIP of Serial No.
585,471, filed September 20, 1990, which is a CIP of
Serial No. 455,611, filed December 22, 1989, which is a
CIP of Serial No. 143,441, filed January 12, 1988, and
its ancestors as described above; and 2) Serial No.
609,157, filed November 2, 1990, which is a CIP of
Serial No. 557,517, filed July 24, 1990.

This CIP is also related to the following patent
applications:

U.S. Serial No. 523,394, filed May 15, 1990;
U.S. Serial No. 455,967, filed December 22, 1989;
PCT Application No. 91/05571, filed August 6, 1991;
PCT Application No. 91/05753, filed August 13, 1991.

All of the patent applications referenced in this
section are incorporated herein by reference.

Background of the Invention
Field of the Invention 

The present invention relates to thermostable DNA
polymerases which have been altered or mutated such
that a different level of 5' to 3' exonuclease activity
is exhibited from that which is exhibited by the native
enzyme. The present invention also relates to means
for isolating and producing such altered polymerases.
Thermostable DNA polymerases are useful in many
recombinant DNA techniques, especially nucleic acid
amplification by the polymerase chain reaction (PCR),
self-sustained sequence replication (3SR), and high
temperature DNA sequencing.

Background Art 

Extensive research has been conducted on the
isolation of DNA polymerases from mesophilic
microorganisms such as E. coli. See, for example,
Bessman et al., 1957, J. Biol. Chem. 223:171-177 and
Buttin and Kornberg, 1966, J. Biol. Chem. 241:5419-5427.
Somewhat less investigation has been made on the
isolation and purification of DNA polymerases from
thermophiles such as Thermus aquaticus, Thermus
thermophilus, Thermotoga maritima, Thermus species
sps17, Thermus species Z05 and Thermosipho africanus.
The use of thermostable enzymes to amplify existing
nucleic acid sequences in amounts that are large
compared to the amount initially present was described
in United States Patent Nos. 4,683,195 and 4,683,202,
which describe the PCR process, both disclosures of
which are incorporated herein by reference. Primers,
template, nucleoside triphosphates, the appropriate
buffer and reaction conditions, and polymerase are used
in the PCR process, which involves denaturation of
target DNA, hybridization of primers, and synthesis of
complementary strands. The extension product of each
primer becomes a template for the production of the
desired nucleic acid sequence. The two patents
disclose that, if the polymerase employed is a
thermostable enzyme, then polymerase need not be added
after every denaturation step, because heat will not
destroy the polymerase activity.

United States Patent No. 4,889,818, European Patent
Publication No. 258,017 and PCT Publication No.
89/06691, the disclosures of which are incorporated
herein by reference, all describe the isolation and
recombinant expression of an ~94 kDa thermostable DNA
polymerase from Thermus aquaticus and the use of that
polymerase in PCR. Although T. aquaticus DNA
polymerase is especially preferred for use in PCR and
other recombinant DNA techniques, there remains a need
for other thermostable polymerases.

Summary of the Invention 

In addressing the need for other thermostable
polymerases, the present inventors found that some
thermostable DNA polymerases such as that isolated from
Thermus aquaticus (Taq) display a 5' to 3' exonuclease
or structure-dependent single-stranded endonuclease
(SDSSE) activity. As is explained in greater detail
below, such 5' to 3' exonuclease activity is
undesirable in an enzyme to be used in PCR, because it
may limit the amount of product produced and contribute
to the plateau phenomenon in the normally exponential
accumulation of product. Furthermore, the presence of
5' to 3' nuclease activity in a thermostable DNA
polymerase may contribute to an impaired ability to
efficiently generate long PCR products greater than or
equal to 10 kb, particularly for G+C-rich targets. In
DNA sequencing applications and cycle sequencing
applications, the presence of 5' to 3' nuclease activity
may contribute to reduction in desired band intensities
and/or generation of spurious or background bands.
Finally, the absence of 5' to 3' nuclease activity may
facilitate higher sensitivity allelic discrimination in
a combined polymerase ligase chain reaction (PLCR)
assay.

However, an enhanced or greater amount of 5' to 3'
exonuclease activity in a thermostable DNA polymerase
may be desirable when the enzyme is used in a
homogeneous assay system for the concurrent
amplification and detection of a target nucleic acid
sequence. Generally, an enhanced 5' to 3' exonuclease
activity is defined as an enhanced rate of exonuclease
cleavage or an enhanced rate of nick-translation
synthesis or by the displacement of a larger nucleotide
fragment before cleavage of the fragment.

Accordingly, the present invention was developed to
meet the needs of the prior art by providing
thermostable DNA polymerases which exhibit altered 5' to 3'
exonuclease activity. Depending on the purpose for
which the thermostable DNA polymerase will be used, the
5' to 3' exonuclease activity of the polymerase may be
altered such that a range of 5' to 3' exonuclease
activity may be expressed. This range of 5' to 3'
exonuclease activity extends from an enhanced activity
to a complete lack of activity. Although enhanced
activity is useful in certain PCR applications, e.g. a
homogeneous assay, as little 5' to 3' exonuclease
activity as possible is desired in thermostable DNA
polymerases utilized in most other PCR applications.

It was also found that both site-directed
mutagenesis as well as deletion mutagenesis may result
in the desired altered 5' to 3' exonuclease activity in
the thermostable DNA polymerases of the present
invention. Some mutations which alter the exonuclease
activity have been shown to alter the processivity of
the DNA polymerase. In many applications (e.g.
amplification of moderate sized targets in the presence
of a large amount of high complexity genomic DNA)
reduced processivity may simplify the optimization of
PCRs and contribute to enhanced specificity at high
enzyme concentration. Some mutations which eliminate
5' to 3' exonuclease activity do not reduce and may
enhance the processivity of the thermostable DNA
polymerase and accordingly, these mutant enzymes may be
preferred in other applications (e.g. generation of
long PCR products). Some mutations which eliminate the
5' to 3' exonuclease activity simultaneously enhance,
relative to the wild type, the thermoresistance of the
mutant thermostable polymerase, and thus, these mutant
enzymes find additional utility in the amplification of
G+C-rich or otherwise difficult to denature targets.

Particular common regions or domains of thermostable
DNA polymerase genomes have been identified as
preferred sites for mutagenesis to affect the enzyme's
5' to 3' exonuclease. These domains can be isolated
and inserted into a thermostable DNA polymerase having
none or little natural 5' to 3' exonuclease activity to
enhance its activity. Thus, methods of preparing
chimeric thermostable DNA polymerases with altered 5'
to 3' exonuclease are also encompassed by the present
invention.

Detailed Description of the Invention 

The present invention provides DNA sequences and
expression vectors that encode thermostable DNA
polymerases which have been mutated to alter the
expression of 5' to 3' exonuclease. To facilitate
understanding of the invention, a number of terms are
defined below.

The terms "cell", "cell line", and "cell culture"
can be used interchangeably and all such designations
include progeny. Thus, the words "transformants" or
"transformed cells" include the primary transformed
cell and cultures derived from that cell without regard
to the number of transfers. All progeny may not be
precisely identical in DNA content, due to deliberate
or inadvertent mutations. Mutant progeny that have the
same functionality as screened for in the originally
transformed cell are included in the definition of
transformants.

The term "control sequences" refers to DNA
sequences necessary for the expression of an operably
linked coding sequence in a particular host organism.
The control sequences that are suitable for
procaryotes, for example, include a promoter,
optionally an operator sequence, a ribosome binding
site, and possibly other sequences. Eucaryotic cells
are known to utilize promoters, polyadenylation
signals, and enhancers.

The term "expression system" refers to DNA
sequences containing a desired coding sequence and
control sequences in operable linkage, so that hosts
transformed with these sequences are capable of
producing the encoded proteins. To effect
transformation, the expression system may be included
on a vector; however, the relevant DNA may also be
integrated into the host chromosome.

The term "gene" refers to a DNA sequence that
comprises control and coding sequences necessary for
the production of a recoverable bioactive polypeptide
or precursor. The polypeptide can be encoded by a full
length coding sequence or by any portion of the coding
sequence so long as the enzymatic activity is retained.

The term "operably linked" refers to the
positioning of the coding sequence such that control
sequences will function to drive expression of the
protein encoded by the coding sequence. Thus, a coding
sequence "operably linked" to control sequences refers
to a configuration wherein the coding sequences can be
expressed under the direction of a control sequence.

The term "mixture" as it relates to mixtures
containing thermostable polymerases refers to a
collection of materials which includes a desired
thermostable polymerase but which can also include
other proteins. If the desired thermostable polymerase
is derived from recombinant host cells, the other
proteins will ordinarily be those associated with the
host. Where the host is bacterial, the contaminating
proteins will, of course, be bacterial proteins.

The term "non-ionic polymeric detergents" refers to
surface-active agents that have no ionic charge and
that are characterized, for purposes of this invention,
by an ability to stabilize thermostable polymerase
enzymes at a pH range of from about 3.5 to about 9.5,
preferably from 4 to 8.5.

The term "oligonucleotide" as used herein is
defined as a molecule comprised of two or more
deoxyribonucleotides or ribonucleotides, preferably
more than three, and usually more than ten. The exact
size will depend on many factors, which in turn depend
on the ultimate function or use of the
oligonucleotide. The oligonucleotide may be derived
synthetically or by cloning.

The term "primer" as used herein refers to an
oligonucleotide which is capable of acting as a point
of initiation of synthesis when placed under conditions
in which primer extension is initiated. An
oligonucleotide "primer" may occur naturally, as in a
purified restriction digest, or be produced
synthetically. Synthesis of a primer extension product
which is complementary to a nucleic acid strand is
initiated in the presence of four different nucleoside
triphosphates and a thermostable polymerase enzyme in
an appropriate buffer at a suitable temperature. A
"buffer" includes cofactors (such as divalent metal
ions) and salt (to provide the appropriate ionic
strength), adjusted to the desired pH.

A primer is single-stranded for maximum efficiency
in amplification, but may alternatively be
double-stranded. If double-stranded, the primer is
first treated to separate its strands before being used
to prepare extension products. The primer is usually
an oligodeoxyribonucleotide. The primer must be
sufficiently long to prime the synthesis of extension
products in the presence of the polymerase enzyme. The
exact length of a primer will depend on many factors,
such as source of primer and result desired, and the
reaction temperature must be adjusted depending on
primer length and nucleotide sequence to ensure proper
annealing of primer to template. Depending on the
complexity of the target sequence, an oligonucleotide
primer typically contains 15 to 35 nucleotides. Short
primer molecules generally require lower temperatures
to form sufficiently stable complexes with template.

A primer is selected to be "substantially"
complementary to a strand of specific sequence of the
template. A primer must be sufficiently complementary
to hybridize with a template strand for primer
elongation to occur. A primer sequence need not
reflect the exact sequence of the template. For
example, a non-complementary nucleotide fragment may be
attached to the 5' end of the primer, with the
remainder of the primer sequence being substantially
complementary to the strand. Non-complementary bases
or longer sequences can be interspersed into the
primer, provided that the primer sequence has
sufficient complementarity with the sequence of the
template to hybridize and thereby form a template
primer complex for synthesis of the extension product
of the primer.

The terms "restriction endonucleases" and
"restriction enzymes" refer to bacterial enzymes which
cut double-stranded DNA at or near a specific
nucleotide sequence.

The term "thermostable polymerase enzyme" refers to
an enzyme which is stable to heat and is heat resistant
and catalyzes (facilitates) combination of the
nucleotides in the proper manner to form primer
extension products that are complementary to a template
nucleic acid strand. Generally, synthesis of a primer
extension product begins at the 3' end of the primer
and proceeds in the 5' direction along the template
strand, until synthesis terminates.

In order to further facilitate understanding of the
invention, specific thermostable DNA polymerase enzymes
are referred to throughout the specification to
exemplify the broad concepts of the invention, and
these references are not intended to limit the scope of
the invention. The specific enzymes which are
frequently referenced are set forth below with a common
abbreviation which will be used in the specification
and their respective nucleotide and amino acid Sequence
ID numbers.

Thermostable DNA           Common
Polymerase                 Abbr.     SEQ ID NO:

Thermus aquaticus          Taq       SEQ ID NO:1 (nuc)
                                     SEQ ID NO:2 (a.a.)
Thermotoga maritima        Tma       SEQ ID NO:3 (nuc)
                                     SEQ ID NO:4 (a.a.)
Thermus species sps17      Tsps17    SEQ ID NO:5 (nuc)
                                     SEQ ID NO:6 (a.a.)
Thermus species Z05        TZ05      SEQ ID NO:7 (nuc)
                                     SEQ ID NO:8 (a.a.)
Thermus thermophilus       Tth       SEQ ID NO:9 (nuc)
                                     SEQ ID NO:10 (a.a.)
Thermosipho africanus      Taf       SEQ ID NO:11 (nuc)
                                     SEQ ID NO:12 (a.a.)


As summarized above, the present invention relates
to thermostable DNA polymerases which exhibit altered
5' to 3' exonuclease activity from that of the native
polymerase. Thus, the polymerases of the invention
exhibit either an enhanced 5' to 3' exonuclease
activity or an attenuated 5' to 3' exonuclease activity
from that of the native polymerase.

Thermostable DNA Polymerases With Attenuated
5' to 3' Exonuclease Activity

DNA polymerases often possess multiple functions.
In addition to the polymerization of nucleotides, E.
coli DNA polymerase I (pol I), for example, catalyzes
the pyrophosphorolysis of DNA as well as the hydrolysis
of phosphodiester bonds. Two such hydrolytic
activities have been characterized for pol I; one is a
3' to 5' exonuclease activity and the other a 5' to 3'
exonuclease activity. The two exonuclease activities
are associated with two different domains of the pol I
molecule. However, the 5' to 3' exonuclease activity
of pol I differs from that of thermostable DNA
polymerases in that the 5' to 3' exonuclease activity
of thermostable DNA polymerases has stricter structural
requirements for the substrate on which it acts.

An appropriate and sensitive assay for the 5' to 3'
exonuclease activity of thermostable DNA polymerases
takes advantage of the discovery of the structural
requirement of the activity. An important feature of
the design of the assay is an upstream oligonucleotide
primer which positions the polymerase appropriately for
exonuclease cleavage of a labeled downstream
oligonucleotide probe. For an assay of
polymerization-independent exonuclease activity (i.e., an assay
performed in the absence of deoxynucleoside
triphosphates) the probe must be positioned such that
the region of probe complementary to the template is
immediately adjacent to the 3'-end of the primer.
Additionally, the probe should contain at least one,
but preferably 2-10, or most preferably 3-5 nucleotides
at the 5'-end of the probe which are not complementary
to the template. The combination of the primer and
probe when annealed to the template creates a double
stranded structure containing a nick with a 3'-hydroxyl
5' of the nick, and a displaced single strand 3' of the
nick. Alternatively, the assay can be performed as a
polymerization-dependent reaction, in which case each
deoxynucleoside triphosphate should be included at a
concentration of between 1 µM and 2 mM, preferably
between 10 µM and 200 µM, although limited dNTP
addition (and thus limited dNTP inclusion) may be
involved as dictated by the template sequence. When
the assay is performed in the presence of dNTPs, the
necessary structural requirements are an upstream
oligonucleotide primer to direct the synthesis of the
complementary strand of the template by the polymerase,
and a labeled downstream oligonucleotide probe which
will be contacted by the polymerase in the process of
extending the upstream primer. An example of a
polymerization-independent thermostable DNA polymerase
5' to 3' exonuclease assay follows.

The synthetic 3' phosphorylated oligonucleotide
probe (phosphorylated to preclude polymerase extension)
BW33 (GATCGCTGCGCGTAACCACCACACCCGCCGCGCp) (SEQ ID
NO:13) (100 pmol) was 32P-labeled at the 5' end with
gamma-[32P]ATP (3000 Ci/mmol) and T4 polynucleotide
kinase. The reaction mixture was extracted with
phenol:chloroform:isoamyl alcohol, followed by ethanol
precipitation. The 32P-labeled oligonucleotide probe
was redissolved in 100 µl of TE buffer, and
unincorporated ATP was removed by gel filtration
chromatography on a Sephadex G-50 spin column. Five
pmol of 32P-labeled BW33 probe was annealed to 5 pmol
of single-strand M13mp10w DNA, in the presence of
5 pmol of the synthetic oligonucleotide primer BW37
(GCGCTAGGGCGCTGGCAAGTGTAGCGGTCA) (SEQ ID NO:14) in a
100 µl reaction containing 10 mM Tris-HCl (pH 8.3),
50 mM KCl, and 3 mM MgCl2. The annealing mixture was
heated to 95°C for 5 minutes, cooled to 70°C over 10
minutes, incubated at 70°C for an additional 10
minutes, and then cooled to 25°C over a 30 minute
period in a Perkin-Elmer Cetus DNA Thermal Cycler.
Exonuclease reactions containing 10 µl of the annealing
mixture were pre-incubated at 70°C for 1 minute.
Thermostable DNA polymerase enzyme (approximately 0.01
to 1 unit of DNA polymerase activity, or 0.0005 to 0.05
pmol of enzyme) was added in a 2.5 µl volume to the
pre-incubation reaction, and the reaction mixture was
incubated at 70°C. Aliquots (5 µl) were removed after
1 minute and 5 minutes, and stopped by the addition of
1 µl of 60 mM EDTA. The reaction products were
analyzed by homochromatography and exonuclease activity
was quantified following autoradiography.
Chromatography was carried out in a homochromatography
mix containing 2% partially hydrolyzed yeast RNA in 7M
urea on Polygram CEL 300 DEAE cellulose thin layer
chromatography plates. The presence of 5' to 3'
exonuclease activity results in the generation of small
32P-labeled oligomers, which migrate up the TLC plate,
and are easily differentiated on the autoradiogram from
undegraded probe, which remains at the origin.
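
The amounts and volumes given for this assay imply several working
concentrations that the text does not state directly. The following
Python sketch works them out. The function and variable names are
illustrative only and are not part of the specification, and the
calculation assumes that volumes are simply additive and that the
enzyme addition contributes no probe, template or EDTA.

```python
def nanomolar(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in nM of amount_pmol dissolved in volume_ul microliters."""
    # 1 pmol per microliter is 1 micromolar, i.e. 1000 nM.
    return amount_pmol / volume_ul * 1000.0


def after_mixing_mM(stock_mM: float, stock_ul: float, sample_ul: float) -> float:
    """Final mM concentration of a stock after adding it to a sample volume."""
    return stock_mM * stock_ul / (stock_ul + sample_ul)


# Annealing mixture: 5 pmol each of probe, template and primer in 100 µl.
ANNEAL_UL = 100.0
probe_anneal_nM = nanomolar(5.0, ANNEAL_UL)

# Exonuclease reaction: 10 µl of annealing mixture plus 2.5 µl of enzyme.
reaction_ul = 10.0 + 2.5
substrate_pmol = 5.0 * 10.0 / ANNEAL_UL
substrate_nM = nanomolar(substrate_pmol, reaction_ul)

# Enzyme range quoted in the text: 0.0005 to 0.05 pmol per reaction.
enzyme_pmol_range = (0.0005, 0.05)
enzyme_nM_range = tuple(nanomolar(e, reaction_ul) for e in enzyme_pmol_range)
substrate_excess = tuple(substrate_pmol / e for e in enzyme_pmol_range)

# Quench: 1 µl of 60 mM EDTA added to each 5 µl aliquot.
edta_final_mM = after_mixing_mM(60.0, 1.0, 5.0)

if __name__ == "__main__":
    print(f"probe during annealing: {probe_anneal_nM:.1f} nM")
    print(f"substrate in reaction:  {substrate_nM:.1f} nM")
    print(f"enzyme in reaction:     {enzyme_nM_range[0]:.2f} to {enzyme_nM_range[1]:.2f} nM")
    print(f"substrate:enzyme ratio: {substrate_excess[1]:.0f}x to {substrate_excess[0]:.0f}x")
    print(f"EDTA after quench:      {edta_final_mM:.1f} mM")
```

Under these assumptions the annealed substrate is in molar excess over
enzyme across the whole quoted enzyme range, and the EDTA added at each
time point exceeds the MgCl2 carried over from the annealing buffer.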

The 5' to 3' exonuclease activity of the
thermostable DNA polymerases excises 5' terminal
regions of double-stranded DNA releasing 5'-mono- and
oligonucleotides in a sequential manner. The preferred
substrate for the exonuclease is displaced
single-stranded DNA, with hydrolysis of the phosphodiester
bond occurring between the displaced single-stranded
DNA and the double-helical DNA. The preferred
exonuclease cleavage site is a phosphodiester bond in
the double helical region. Thus, the exonuclease
activity can be better described as a
structure-dependent single-stranded endonuclease
(SDSSE).

Many thermostable polymerases exhibit this 5' to 3'
exonuclease activity, including the DNA polymerases of
Taq, Tma, Tsps17, TZ05, Tth and Taf. When thermostable
polymerases which have 5' to 3' exonuclease activity
are utilized in the PCR process, a variety of
undesirable results have been observed including a
limitation of the amount of product produced, an
impaired ability to generate long PCR products or
amplify regions containing significant secondary
structure, the production of shadow bands or the
attenuation in signal strength of desired termination
bands during DNA sequencing, the degradation of the
5'-end of oligonucleotide primers in the context of
double-stranded primer-template complex,
nick-translation synthesis during oligonucleotide-directed
mutagenesis and the degradation of the RNA component of
RNA:DNA hybrids.

The limitation of the amount of PCR product
produced is attributable to a plateau phenomenon in the
otherwise exponential accumulation of product. Such a
plateau phenomenon occurs in part because 5' to 3'
exonuclease activity causes the hydrolysis or cleavage
of phosphodiester bonds when a polymerase with 5' to 3'
exonuclease activity encounters a forked structure on a
PCR substrate.

Such forked structures commonly exist in certain G-
and C-rich DNA templates. The cleavage of these
phosphodiester bonds under these circumstances is
undesirable as it precludes the amplification of
certain G- and C-rich targets by the PCR process.
Furthermore, the phosphodiester bond cleavage also
contributes to the plateau phenomenon in the generation
of the later cycles of PCR when product strand
concentration and renaturation kinetics result in
forked structure substrates.

In the context of DNA sequencing, the 5' to 3'
exonuclease activity of DNA polymerases is again a
hindrance with forked structure templates because the
phosphodiester bond cleavage during the DNA extension
reactions results in "false stops". These "false
stops" in turn contribute to shadow bands, and in
extreme circumstances may result in the absence of
accurate and interpretable sequence data.

When utilized in a PCR process with double-stranded
primer-template complex, the 5' to 3' exonuclease
activity of a DNA polymerase may result in the
degradation of the 5'-end of the oligonucleotide
primers. This activity is not only undesirable in PCR,
but also in second-strand cDNA synthesis and sequencing.

During optimally efficient oligonucleotide-directed
mutagenesis processes, the DNA polymerase which is
utilized must not have strand-displacement synthesis
and/or nick-translation capability. Thus, the presence
of 5' to 3' exonuclease activity in a polymerase used
for oligonucleotide-directed mutagenesis is also
undesirable.

Finally, the 5' to 3' exonuclease activity of
polymerases generally also contains an inherent RNase H
activity. However, when the polymerase is also to be
used as a reverse transcriptase, as in a PCR process
including an RNA:DNA hybrid, such an inherent RNase H
activity may be disadvantageous.

Thus, one aspect of this invention involves the
generation of thermostable DNA polymerase mutants
displaying greatly reduced, attenuated or completely
eliminated 5' to 3' exonuclease activity. Such mutant
thermostable DNA polymerases will be more suitable and
desirable for use in processes such as PCR,
second-strand cDNA synthesis, sequencing and
oligonucleotide-directed mutagenesis.

The production of thermostable DNA polymerase
mutants with attenuated or eliminated 5' to 3'
exonuclease activity may be accomplished by processes
such as site-directed mutagenesis and deletion
mutagenesis.

For example, a site-directed mutation of G to A in
the second position of the codon for Gly at residue 46
in the Taq DNA polymerase amino acid sequence (i.e.
mutation of G(137) to A in the DNA sequence) has been
found to result in an approximately 1000-fold reduction
of 5' to 3' exonuclease activity with no apparent
change in polymerase activity, processivity or
extension rate. This site-directed mutation of the Taq
DNA polymerase nucleotide sequence results in an amino
acid change of Gly (46) to Asp.
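
The correspondence between nucleotide 137 and the second position of
codon 46 can be checked with a short calculation. The Python sketch
below is illustrative only: the function names are hypothetical, and it
assumes that nucleotide numbering starts at the first base of the
coding sequence, which this section implies but does not state. It
also lists what a G to A change at the second position does to each of
the four glycine codons under the standard genetic code; the third
base of the Taq codon is not given in this section, so no particular
codon is assumed.

```python
def codon_position(nt_index: int, cds_start: int = 1) -> tuple[int, int]:
    """Map a 1-based nucleotide index to (codon number, position within codon).

    Both returned values are 1-based. cds_start is the index of the first
    base of the coding sequence (assumed to be 1 here).
    """
    offset = nt_index - cds_start
    if offset < 0:
        raise ValueError("nucleotide lies upstream of the coding sequence")
    return offset // 3 + 1, offset % 3 + 1


# Standard genetic code entries needed for this illustration.
GLY_CODONS = ("GGT", "GGC", "GGA", "GGG")
GAN_CODONS = {"GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu"}


def second_base_g_to_a(codon: str) -> str:
    """Return the codon produced by changing a second-position G to A."""
    if len(codon) != 3 or codon[1] != "G":
        raise ValueError("expected a three-base codon with G at position 2")
    return codon[0] + "A" + codon[2]


# Amino acid encoded after the change, for each possible glycine codon.
outcomes = {c: GAN_CODONS[second_base_g_to_a(c)] for c in GLY_CODONS}

if __name__ == "__main__":
    print(codon_position(137))
    for codon, residue in outcomes.items():
        print(f"{codon} -> {second_base_g_to_a(codon)} ({residue})")
```

Because only GAT and GAC encode aspartic acid, a Gly to Asp change by
this single substitution is possible only when the glycine codon ends
in a pyrimidine.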
Glycine 46 of Taq DNA polymerase is conserved in
Thermus species sps17 DNA polymerase, but is located at
residue 43, and the same Gly to Asp mutation has a
similar effect on the 5' to 3' exonuclease activity of
Tsps17 DNA polymerase. Such a mutation of the
conserved Gly of Tth (Gly 46), TZ05 (Gly 46), Tma (Gly 37)
and Taf (Gly 37) DNA polymerases to Asp also has a
similar attenuating effect on the 5' to 3' exonuclease
activities of those polymerases.

Tsps17 Gly 43, Tth Gly 46, TZ05 Gly 46, Tma Gly 37
and Taf Gly 37 are also found in a conserved A(V/T)YG
(SEQ ID NO:15) sequence domain, and changing the
glycine to aspartic acid within this conserved sequence
domain of any polymerase is also expected to attenuate
5' to 3' exonuclease activity. Specifically, Tsps17
Gly 43, Tth Gly 46, TZ05 Gly 46, and Taf Gly 37 share
the AVYG sequence domain, and Tma Gly 37 is found in
the ATYG domain. Mutations of glycine to aspartic acid
in other thermostable DNA polymerases containing the
conserved A(V/T)YG (SEQ ID NO:15) domain can be
accomplished utilizing the same principles and
techniques used for the site-directed mutagenesis of
Taq polymerase. Exemplary of such site-directed
mutagenesis techniques are Example 5 of U.S. Serial
No. 523,394, filed May 15, 1990, Example 4 of Attorney
Docket No. 2583.1 filed September 27, 1991, Examples 4
and 5 of U.S. Serial No. 455,967, filed December 22,
1989 and Examples 5 and 8 of PCT Application No.
91/05753, filed August 13, 1991.
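
Locating the conserved A(V/T)YG domain in a candidate polymerase
sequence is a simple pattern search. The Python sketch below is a
minimal illustration and not the procedure used in the specification:
the function names are hypothetical, and the short test strings are
made-up peptides, not portions of any SEQ ID listed above.

```python
import re

# A(V/T)YG: Ala, then Val or Thr, then Tyr, then the conserved Gly.
MOTIF = re.compile(r"A[VT]YG")


def conserved_glycines(protein: str) -> list[int]:
    """Return 1-based positions of the Gly that ends each A(V/T)YG match."""
    # m.start() is the 0-based index of the Ala, so the Gly is 3 residues
    # further on; adding 1 converts to 1-based numbering.
    return [m.start() + 4 for m in MOTIF.finditer(protein.upper())]


def gly_to_asp(protein: str, position: int) -> str:
    """Return the sequence with the Gly at a 1-based position changed to Asp."""
    if protein[position - 1].upper() != "G":
        raise ValueError(f"residue {position} is not glycine")
    return protein[: position - 1] + "D" + protein[position:]


if __name__ == "__main__":
    toy = "MKLLAVYGDHH"  # hypothetical peptide, for illustration only
    for pos in conserved_glycines(toy):
        print(pos, gly_to_asp(toy, pos))
```

A search of this kind only nominates candidate residues. Whether the
Gly to Asp change attenuates 5' to 3' exonuclease activity in a given
enzyme still has to be established experimentally, for example with
the assay described above.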

Such site-directed mutagenesis is generally
accomplished by site-specific primer-directed
mutagenesis. This technique is now standard in the
art, and is conducted using a synthetic oligonucleotide
primer complementary to a single-stranded phage DNA to
be mutagenized except for limited mismatching,
representing the desired mutation. Briefly, the
synthetic oligonucleotide is used as a primer to direct
synthesis of a strand complementary to the phasmid or
phage, and the resulting double-stranded DNA is
transformed into a phage-supporting host bacterium.
Cultures of the transformed bacteria are plated in top
agar, permitting plaque formation from single cells
that harbor the phage, or plated on drug selective media
for phasmid vectors.

Theoretically, 50% of the new plaques will contain
the phage having, as a single strand, the mutated form;
50% will have the original sequence. The plaques are
transferred to nitrocellulose filters and the "lifts"
hybridized with kinased synthetic primer at a
temperature that permits hybridization of an exact
match, but at which the mismatches with the original
strand are sufficient to prevent hybridization.
Plaques that hybridize with the probe are then picked
and cultured, and the DNA is recovered.

In the constructions set forth below, correct
ligations for plasmid construction are confirmed by
first transforming E. coli strains DG98, DG101, DG116,
or other suitable hosts, with the ligation mixture.
Successful transformants are selected by ampicillin,
tetracycline or other antibiotic resistance or using
other markers, depending on the mode of plasmid
construction, as is understood in the art. Plasmids
from the transformants are then prepared according to
the method of Clewell, D.B., et al., Proc. Natl. Acad.
Sci. (USA) (1969) 62:1159, optionally following
chloramphenicol amplification (Clewell, D.B., J.
Bacteriol. (1972) 110:667). The isolated DNA is
analyzed by restriction and/or sequenced by the dideoxy
method of Sanger, F., et al., Proc. Natl. Acad. Sci.
(USA) (1977) 74:5463 as further described by Messing,
et al., Nucleic Acids Res. (1981) 9:309, or by the
method of Maxam, et al., Methods in Enzymology (1980)
65:499.

For cloning and sequencing, and for expression of
constructions under control of most lac or PL
promoters, E. coli strains DG98, DG101, DG116 were used
as the host. For expression under control of the
PLNRBS promoter, E. coli strain K12 MC1000 lambda
lysogen, N7N53cI857 SusP80, ATCC 39531 may be used.
Exemplary hosts used herein for expression of the
thermostable DNA polymerases with altered 5' to 3'
exonuclease activity are E. coli DG116, which was
deposited with ATCC (ATCC 53606) on April 7, 1987 and
E. coli KB2, which was deposited with ATCC (ATCC 53075)
on March 29, 1985.

For M13 phage recombinants, E. coli strains
susceptible to phage infection, such as E. coli K12
strain DG98, are employed. The DG98 strain has been
deposited with ATCC July 13, 1984 and has accession
number 39768.

Mammalian expression can be accomplished in COS-7,
COS-A2, CV-1, and murine cells, and insect cell-based
expression in Spodoptera frugiperda.

The thermostable DNA polymerases of the present
invention are generally purified from E. coli strain
DG116 containing the features of plasmid pLSG33. The
primary features are a temperature regulated promoter
(lambda P_L promoter), a temperature regulated plasmid
vector, a positive retro-regulatory element (PRE) (see
U.S. 4,666,848, issued May 19, 1987), and a modified
form of a thermostable DNA polymerase gene. As
described at page 46 of the specification of U.S. patent
application Serial No. 455,967, pLSG33 was prepared by
ligating the NdeI-BamHI restriction fragment of pLSG24
into expression vector pDG178. The resulting plasmids
are ampicillin resistant and capable of expressing 5'
to 3' exonuclease deficient forms of the thermostable
DNA polymerases of the present invention.

The seed flask for a 10 liter fermentation contains
tryptone (20 g/l), yeast extract (10 g/l), NaCl (10 g/l)
and 0.005% ampicillin. The seed flask is inoculated from
colonies from an agar plate, or a frozen glycerol culture
stock can be used. The seed is grown to between 0.5 and
1.0 O.D. (A680). The volume of seed culture inoculated
into the fermentation is calculated such that the final
concentration of bacteria will be 1 mg dry
weight/liter. The 10 liter growth medium contains
25 mM KH2PO4, 10 mM (NH4)2SO4, 4 mM sodium citrate,
0.4 mM FeCl2, 0.04 mM ZnCl2, 0.03 mM CoCl2, 0.03 mM
CuCl2, and 0.03 mM H3BO3. The following sterile
components are added: 4 mM MgSO4, 20 g/l glucose,
20 mg/l thiamine-HCl and 50 mg/l ampicillin. The pH
is adjusted to 6.8 with NaOH and controlled during the
fermentation by added NH4OH. Glucose is continually
added during the fermentation by coupling to NH4OH
addition. Foaming is controlled by the addition of
polypropylene glycol as necessary, as an anti-foaming
agent. Dissolved oxygen concentration is maintained at
40%.

The fermentation is inoculated as described above
and the culture is grown at 30°C until an optical
density of 21 (A680) is reached. The temperature is
then raised to 37°C to induce synthesis of the desired
polymerase. Growth continues for eight hours after
induction, and the cells are then harvested by
concentration using cross flow filtration followed by
centrifugation. The resulting cell paste, about 500
grams, is frozen at -70°C. Unless otherwise indicated,
all purification steps are conducted at 4°C.

A portion of the frozen (-70°C) E. coli K12 strain
DG116 harboring plasmid pLSG33 or other suitable host
as described above is warmed overnight to -20°C. To
the cell pellet the following reagents are added:
1 volume of 2X TE (100 mM Tris-HCl, pH 7.5, 20 mM
EDTA), 1 mg/ml leupeptin and 144 mM PMSF (in dimethyl
formamide). The final concentration of leupeptin is
1 µg/ml and of PMSF, 2.4 mM. Preferably,
dithiothreitol (DTT) is included in TE to provide a
final concentration of 1 mM DTT. The mixture is
homogenized at low speed in a blender. All glassware
is baked prior to use, and solutions used in the
purification are autoclaved, if possible, prior to
use. The cells are lysed by passage twice through a
Microfluidizer at 10,000 psi.

The lysate is diluted with 1X TE containing 1 mM
DTT to a final volume of 5.5X cell wet weight.
Leupeptin is added to 1 µg/ml and PMSF is added to 2.4
mM. The final volume (Fraction I) is approximately
1540 ml.
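
The inhibitor additions and the lysate dilution above are plain
dilution arithmetic. The following is a minimal sketch, not part of
the original procedure. It assumes that 1 g of wet cells occupies
roughly 1 ml (so that "5.5X cell wet weight" can be read as a volume)
and ignores any inhibitor carried over from the homogenization step.
The function names are illustrative only.

```python
def fold_dilution(stock_conc, final_conc):
    """Fold dilution that brings a stock to its final concentration."""
    return stock_conc / final_conc


def stock_volume_ml(final_volume_ml, stock_conc, final_conc):
    """Stock volume (ml) needed for a final volume, from C1*V1 = C2*V2."""
    return final_volume_ml * final_conc / stock_conc


# Concentrations from the text; units must match within each pair.
leupeptin_fold = fold_dilution(1000.0, 1.0)  # 1 mg/ml stock -> 1 ug/ml final
pmsf_fold = fold_dilution(144.0, 2.4)        # 144 mM stock -> 2.4 mM final

# Assumption: 1 g wet cells ~ 1 ml, so a 1540 ml Fraction I at 5.5X
# cell wet weight implies roughly this much cell paste (in grams).
implied_cell_paste_g = 1540.0 / 5.5

# PMSF stock needed to dose the whole of Fraction I from zero.
pmsf_stock_ml = stock_volume_ml(1540.0, 144.0, 2.4)

print(f"leupeptin: {leupeptin_fold:.0f}-fold, PMSF: {pmsf_fold:.0f}-fold")
print(f"implied cell paste: {implied_cell_paste_g:.0f} g")
print(f"PMSF stock for Fraction I: {pmsf_stock_ml:.1f} ml")
```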

Ammonium sulfate is gradually added to 0.2 M (26.4
g/l) and the lysate stirred. Upon addition of ammonium
sulfate, a precipitate forms which is removed prior to
the polyethylenimine (PEI) precipitation step,
described below. The ammonium sulfate precipitate is
removed by centrifugation of the suspension at 15,000 -
20,000 xg in a JA-14 rotor for 20 minutes. The
supernatant is decanted and retained. The ammonium
sulfate supernatant is then stirred on a heating plate
until the supernatant reaches 75°C and then is placed
in a 77°C bath and held there for 15 minutes with
occasional stirring. The supernatant is then cooled in
an ice bath to 20°C and a 10 ml aliquot is removed for
PEI titration.
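
The 26.4 g/l figure for 0.2 M ammonium sulfate can be checked from
the molar mass of (NH4)2SO4 (132.14 g/mol). The sketch below is an
illustration only: it ignores the volume change on dissolving the
salt, and applying it to the 1540 ml Fraction I volume is an
assumption made here for the example.

```python
AMMONIUM_SULFATE_G_PER_MOL = 132.14  # molar mass of (NH4)2SO4


def grams_for_molarity(molarity, volume_l, g_per_mol=AMMONIUM_SULFATE_G_PER_MOL):
    """Mass of solute (g) giving the stated molarity in volume_l liters."""
    return molarity * volume_l * g_per_mol


per_liter_g = grams_for_molarity(0.2, 1.0)       # compare with 26.4 g/l
fraction_one_g = grams_for_molarity(0.2, 1.540)  # for about 1540 ml of lysate

print(f"{per_liter_g:.1f} g per liter, {fraction_one_g:.1f} g for 1540 ml")
```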

PEI titration and agarose gel electrophoresis are
used to determine that 0.3% PEI (commercially available
from BDH as Polymin P) precipitates ~90% of the
macromolecular DNA and RNA, i.e., no DNA band is
visible on an ethidium bromide stained agarose gel
after treatment with PEI. PEI is added slowly with
stirring to 0.3% from a 10% stock solution. The PEI
treated supernatant is centrifuged at 10,000 RPM
(17,000 xg) for 20 minutes in a JA-14 rotor. The
supernatant is decanted and retained. The volume
(Fraction II) is approximately 1340 ml.

Fraction II is loaded onto a 2.6 x 13.3 cm (71 ml)
phenyl sepharose CL-4B (Pharmacia-LKB) column following
equilibration with 6 to 10 column volumes of TE
containing 0.2 M ammonium sulfate. Fraction II is
loaded at a linear flow rate of 10 cm/hr, which
corresponds to 0.9 ml/min. The column is washed with 3
column volumes of the equilibration buffer and then with
2 column volumes of TE to remove contaminating non-DNA
polymerase proteins. The recombinant thermostable DNA
polymerase is eluted with 4 column volumes of 2.5 M
urea in TE containing 20% ethylene glycol. The DNA
polymerase containing fractions are identified by
optical absorption (A280), DNA polymerase activity
assay and SDS-PAGE according to standard procedures.
Peak fractions are pooled and filtered through a 0.2
micron sterile vacuum filtration apparatus. The volume
(Fraction III) is approximately 195 ml. The resin is
equilibrated and recycled according to the
manufacturer's recommendations.
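
The bed volume and flow rate quoted for the phenyl sepharose column
follow from the column geometry. The sketch below is illustrative
only. It assumes a cylindrical bed with the stated 2.6 cm figure as
the inner diameter; the function names are not from the original
text.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Volume of a cylindrical column bed (1 cm^3 = 1 ml)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


def volumetric_flow_ml_per_min(linear_cm_per_hr, diameter_cm):
    """Convert a linear flow rate (cm/hr) to ml/min for a given column."""
    cross_section_cm2 = math.pi * (diameter_cm / 2.0) ** 2
    return linear_cm_per_hr * cross_section_cm2 / 60.0


volume = bed_volume_ml(2.6, 13.3)              # compare with 71 ml
flow = volumetric_flow_ml_per_min(10.0, 2.6)   # compare with 0.9 ml/min
elution_ml = 4 * volume                        # 4 column volumes of urea buffer

print(f"bed volume {volume:.1f} ml, flow {flow:.2f} ml/min, "
      f"elution {elution_ml:.0f} ml")
```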

A 2.6 x 1.75 cm (93 ml) heparin sepharose CL-6B
column (Pharmacia-LKB) is equilibrated with 6-10 column
volumes of 0.05 M KCl, 50 mM Tris-HCl, pH 7.5, 0.1 mM
EDTA and 0.2% Tween 20, at 1 column volume/hour.
Preferably, the buffer contains 1 mM DTT. The column
is washed with 3 column volumes of the equilibration
buffer. The desired thermostable DNA polymerase of the
invention is eluted with a 10 column volume linear
gradient of 50-750 mM KCl in the same buffer.
Fractions (one-tenth column volume) are collected in
sterile tubes and the fractions containing the desired
thermostable DNA polymerase are pooled (Fraction IV,
volume 177 ml).

Fraction IV is concentrated to 10 ml on an Amicon
YM30 membrane. For buffer exchange, diafiltration is
done 5 times with 2.5X storage buffer (50 mM Tris-HCl,
pH 7.5, 250 mM KCl, 0.25 mM EDTA, 2.5 mM DTT and 0.5%
Tween-20) by filling the concentrator to 20 ml and
concentrating the volumes to 10 ml each time. The
concentrator is emptied and rinsed with 10 ml 2.5X
storage buffer which is combined with the concentrate
to provide Fraction V.
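
The extent of buffer exchange achieved by the diafiltration can be
estimated with a short calculation. This is a sketch under idealized
assumptions that are not stated in the original text: complete mixing
at each fill, and small buffer solutes passing freely through the
membrane while the polymerase is fully retained.

```python
def residual_fraction(retained_ml, filled_ml, cycles):
    """Fraction of the original buffer solutes left after repeated
    fill-and-concentrate cycles (ideal mixing, freely permeable solutes)."""
    return (retained_ml / filled_ml) ** cycles


# Five cycles of filling to 20 ml and concentrating back to 10 ml.
remaining = residual_fraction(10.0, 20.0, 5)
print(f"original buffer remaining: {remaining:.1%}")
```

Each cycle halves the remaining column buffer, so five cycles leave
about 3% of it before the final 10 ml rinse is added.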

Anion exchange chromatography is used to remove
residual DNA. The procedure is conducted in a
biological safety hood and sterile techniques are
used. A Waters Sep-Pak plus QMA cartridge with a 0.2
micron sterile disposable syringe tip filter unit is
equilibrated with 30 ml of 2.5X storage buffer using a
syringe at a rate of about 5 drops per second. Using a
disposable syringe, Fraction V is passed through the
cartridge at about 1 drop/second and collected in a
sterile tube. The cartridge is flushed with 5 ml of
2.5X storage buffer and pushed dry with air. The
eluant is diluted 1.5 X with 80% glycerol and stored at
-20°C. The resulting final Fraction IV pool contains
active thermostable DNA polymerase with altered 5' to
3' exonuclease activity.

In addition to site-directed mutagenesis of a
nucleotide sequence, deletion mutagenesis techniques
may also be used to attenuate the 5' to 3' exonuclease
activity of a thermostable DNA polymerase. One example
of such a deletion mutation is the deletion of all
amino terminal amino acids up to and including the
glycine in the conserved A(V/T)YG (SEQ ID NO: 15) domain
of thermostable DNA polymerases.

A second deletion mutation affecting 5' to 3'
exonuclease activity is a deletion up to Ala 77 in Taq
DNA polymerase. This amino acid (Ala 77) has been
identified as the amino terminal amino acid in an
approximately 85.5 kDa proteolytic product of Taq DNA
polymerase. This proteolytic product has been
identified in several native Taq DNA polymerase
preparations and the protein appears to be stable.
Since such a deletion up to Ala 77 includes Gly 46, it
will also affect the 5' to 3' exonuclease activity of
Taq DNA polymerase.

However, a deletion mutant beginning with Ala 77
has the added advantage over a deletion mutant
beginning with phenylalanine 47 in that the proteolytic
evidence suggests that the peptide will remain stable.
Furthermore, Ala 77 is found within the sequence HEAYG
(SEQ ID NO: 16) 5 amino acids prior to the sequence YKA
in Taq DNA polymerase. A similar sequence motif HEAYE
(SEQ ID NO: 17) is found in Tth DNA polymerase, TZ05 DNA
polymerase and Tsps17 DNA polymerase. The alanine is 5
amino acids prior to the conserved motif YKA. The
amino acids in the other exemplary thermostable DNA
polymerases which correspond to Taq Ala 77 are Tth Ala
78, TZ05 Ala 78, Tsps17 Ala 74, Tma Leu 72 and Taf Ile
73. A deletion up to the alanine or corresponding
amino acid in the motif HEAY(G/E) (SEQ ID NO: 16 or SEQ
ID NO: 17) in a Thermus species thermostable DNA
polymerase containing this sequence will attenuate its
5' to 3' exonuclease activity. The 5' to 3'
exonuclease motif YKA is also conserved in Tma DNA
polymerase (amino acids 76-78) and Taf DNA polymerase
(amino acids 77-79). In this thermostable polymerase
family, the conserved motif (L/I)LET (SEQ ID NO: 18)
immediately precedes the YKA motif. Taf DNA polymerase
Ile 73 is 5 residues prior to this YKA motif while Tma
DNA polymerase Leu 72 is 5 residues prior to the YKA
motif. A deletion of the Leu or Ile in the motif
(L/I)LETYKA (SEQ ID NO: 19) in a thermostable DNA
polymerase from the Thermotoga or Thermosipho genus
will also attenuate 5' to 3' exonuclease activity.

Thus, a conserved amino acid sequence which defines
the 5' to 3' exonuclease activity of DNA polymerases of
the Thermus genus as well as those of Thermotoga and
Thermosipho has been identified as (I/L/A)X3YKA (SEQ ID
NO: 20), wherein X3 is any sequence of three amino
acids. Therefore, the 5' to 3' exonuclease activity of
thermostable DNA polymerases may also be altered by
mutating this conserved amino acid domain.
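
The (I/L/A)X3YKA motif can be located in a protein sequence with a
simple pattern search. The sketch below is an illustration, not a
method taken from this specification. The two test strings are short
hypothetical fragments built around the HEAYG...YKA and (L/I)LETYKA
motifs described above; they are not the actual polymerase sequences,
and the reported positions are therefore not real residue numbers.

```python
import re

# One of I, L or A, then any three residues, then YKA.
MOTIF = re.compile(r"[ILA].{3}YKA")


def find_motif(protein):
    """Return (1-based position of the first motif residue, matched text)
    for each non-overlapping match of (I/L/A)X3YKA."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(protein)]


# Hypothetical fragments for illustration only.
thermus_like = "MKHEAYGGYKAGR"
thermotoga_like = "MLLETYKAG"

print(find_motif(thermus_like))
print(find_motif(thermotoga_like))
```

In a real sequence, the position reported for the first motif residue
would be the candidate end point for an N-terminal deletion of the
kind described above.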

Those of skill in the art recognize that when such
a deletion mutant is to be expressed in recombinant
host cells, a methionine codon is usually placed at the
5' end of the coding sequence, so that the amino
terminal sequence of the deletion mutant protein would
be MET-ALA in the Thermus genus examples above.
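
At the sequence level, building such a deletion mutant amounts to
dropping the codons that precede the first retained residue and
supplying an ATG start codon when one is not already present. The
sketch below illustrates this on a hypothetical toy coding sequence
in which codon 77 is GCC (Ala); the function name and the sequences
are assumptions made for the example.

```python
def deletion_mutant_cds(cds, first_kept_codon):
    """Remove all codons before first_kept_codon (1-based) and prepend
    ATG unless the truncated sequence already starts with it."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 1 <= first_kept_codon <= len(cds) // 3:
        raise ValueError("first_kept_codon is outside the coding sequence")
    truncated = cds[3 * (first_kept_codon - 1):]
    if not truncated.startswith("ATG"):
        truncated = "ATG" + truncated
    return truncated


# Toy sequence: ATG, 75 filler codons, then GCC (Ala) as codon 77.
toy_cds = "ATG" + "AAA" * 75 + "GCC" + "TACGGC"
mutant = deletion_mutant_cds(toy_cds, 77)
print(mutant)  # begins with ATG GCC, i.e. MET-ALA
```

When the first retained codon is itself a methionine codon, as with an
internal translation start, no extra ATG is added.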

The preferred techniques for performing deletion
mutations involve utilization of known restriction
sites on the nucleotide sequence of the thermostable
DNA polymerase. Following identification of the
particular amino acid or amino acids which are to be
deleted, a restriction site is identified which when
cleaved will cause the cleavage of the target DNA
sequence at a position or slightly 3' distal to the
position corresponding to the amino acid or domain to
be deleted, but retains domains which code for other
properties of the polymerase which are desired.

Alternatively, restriction sites on either side (5'
or 3') of the sequence coding for the target amino acid
or domain may be utilized to cleave the sequence.
However, a ligation of the two desired portions of the
sequence will then be necessary. This ligation may be
performed using techniques which are standard in the
art and exemplified in Example 9 of Serial No. 523,394,
filed May 15, 1990, Example 7 of PCT Application No.
91/05753, filed August 13, 1991 and Serial No. 590,490,
filed September 28, 1990, all of which are incorporated
herein by reference.

Another technique for achieving a deletion mutation
of the thermostable DNA polymerase is by utilizing the
PCR mutagenesis process. In this process, primers are
prepared which incorporate a restriction site domain
and optionally a methionine codon if such a codon is
not already present. Thus, the product of the PCR with
this primer may be digested with an appropriate
restriction enzyme to remove the domain which codes for
5' to 3' exonuclease activity of the enzyme. Then, the
two remaining sections of the product are ligated to
form the coding sequence for a thermostable DNA
polymerase lacking 5' to 3' exonuclease activity. Such
coding sequences can be utilized in expression vectors
in appropriate host cells to produce the desired
thermostable DNA polymerase lacking 5' to 3'
exonuclease activity.

In addition to the Taq DNA polymerase mutants with
reduced 5' to 3' exonuclease activity, it has also been
found that a truncated Tma DNA polymerase with reduced
5' to 3' exonuclease activity may be produced by
recombinant techniques even when the complete coding
sequence of the Tma DNA polymerase gene is present in
an expression vector in E. coli. Such a truncated Tma
DNA polymerase is formed by translation starting with
the methionine codon at position 140. Furthermore,
recombinant means may be used to produce a truncated
polymerase corresponding to the protein produced by
initiating translation at the methionine codon at
position 284 of the Tma coding sequence.

The Tma DNA polymerase lacking amino acids 1 through
139 (about 86 kDa), and the Tma DNA polymerase lacking
amino acids 1 through 283 (about 70 kDa) retain
polymerase activity but have attenuated 5' to 3'
exonuclease activity. An additional advantage of the
70 kDa Tma DNA polymerase is that it is significantly
more thermostable than native Tma polymerase.

Thus, it has been found that the entire sequence of
the intact Tma DNA polymerase I enzyme is not required
for activity. Portions of the Tma DNA polymerase I
coding sequence can be used in recombinant DNA
techniques to produce a biologically active gene
product with DNA polymerase activity.

Furthermore, the availability of DNA encoding the
Tma DNA polymerase sequence provides the opportunity to
modify the coding sequence so as to generate mutein
(mutant protein) forms also having DNA polymerase
activity but with attenuated 5' to 3' exonuclease
activity. The amino (N)-terminal portion of the Tma DNA
polymerase is not necessary for polymerase activity but
rather encodes the 5' to 3' exonuclease activity of the
protein.

Thus, using recombinant DNA methodology, one can
delete approximately up to one-third of the N-terminal
coding sequence of the Tma gene, clone, and express a
gene product that is quite active in polymerase assays
but, depending on the extent of the deletion, has no 5'
to 3' exonuclease activity. Because certain N-terminal
shortened forms of the polymerase are active, the gene
constructs used for expression of these polymerases can
include the corresponding shortened forms of the coding
sequence.

In addition to the N-terminal deletions, individual
amino acid residues in the peptide chain of Tma DNA
polymerase or other thermostable DNA polymerases may be
modified by oxidation, reduction, or other derivation,
and the protein may be cleaved to obtain fragments that
retain polymerase activity but have attenuated 5' to 3'
exonuclease activity. Modifications to the primary
structure of the Tma DNA polymerase coding sequence or
the coding sequences of other thermostable DNA
polymerases by deletion, addition, or alteration so as
to change the amino acids incorporated into the
thermostable DNA polymerase during translation of the
mRNA produced from that coding sequence can be made
without destroying the high temperature DNA polymerase
activity of the protein.

Another technique for preparing thermostable DNA
polymerases containing novel properties such as reduced
or enhanced 5' to 3' exonuclease activity is a "domain
shuffling" technique for the construction of
"thermostable chimeric DNA polymerases". For example,
substitution of the Tma DNA polymerase coding sequence
comprising codons about 291 through about 484 for the
Taq DNA polymerase I codons 289-422 would yield a novel
thermostable DNA polymerase containing the 5' to 3'
exonuclease domain of Taq DNA polymerase (1-289), the
3' to 5' exonuclease domain of Tma DNA polymerase
(291-484), and the DNA polymerase domain of Taq DNA
polymerase (423-832). Alternatively, the 5' to 3'
exonuclease domain and the 3' to 5' exonuclease domains
of Tma DNA polymerase (ca. codons 1-484) may be fused
to the DNA polymerase (dNTP binding and primer/template
binding domains) portions of Taq DNA polymerase (ca.
codons 423-832).
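
The first chimera described above can be sketched as a concatenation
of codon ranges. The code below is an illustration only. The codon
ranges are those given in the text, but the two input strings are
placeholders (runs of a single base) standing in for the real Taq and
Tma coding sequences, and the 900 codon length used for the Tma
placeholder is arbitrary.

```python
def codon_slice(cds, first, last):
    """Nucleotides for codons first..last (1-based, inclusive)."""
    if first < 1 or last < first or 3 * last > len(cds):
        raise ValueError("codon range outside the coding sequence")
    return cds[3 * (first - 1):3 * last]


def assemble_chimera(segments):
    """Join (cds, first_codon, last_codon) segments in the order given."""
    return "".join(codon_slice(cds, first, last) for cds, first, last in segments)


# Placeholder sequences, NOT the real genes.
taq_cds = "A" * (3 * 832)
tma_cds = "C" * (3 * 900)

chimera = assemble_chimera([
    (taq_cds, 1, 289),    # Taq 5' to 3' exonuclease domain
    (tma_cds, 291, 484),  # Tma 3' to 5' exonuclease domain
    (taq_cds, 423, 832),  # Taq DNA polymerase domain
])
print(len(chimera) // 3, "codons in the chimeric coding sequence")
```

Because every segment is cut on codon boundaries, the reading frame
is preserved across both junctions.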

As is apparent, the donors and recipients for the
creation of "thermostable chimeric DNA polymerase" by
"domain shuffling" need not be limited to Taq and Tma
DNA polymerases. Other thermostable polymerases
provide domains analogous to those of Taq and Tma DNA
polymerases. Furthermore, the 5' to 3' exonuclease
domain may derive from a thermostable DNA polymerase
with altered 5' to 3' nuclease activity. For example,
the 1 to 289 5' to 3' nuclease domain of Taq DNA
polymerase may derive from a Gly (46) to Asp mutant
form of the Taq polymerase gene. Similarly, the 5' to
3' nuclease and 3' to 5' nuclease domains of Tma DNA
polymerase may encode a 5' to 3' exonuclease deficient
domain, and be retrieved as a Tma Gly (37) to Asp amino
acid 1 to 484 encoding DNA fragment or alternatively a
truncated Met 140 to amino acid 484 encoding DNA
fragment.

While any of a variety of means may be used to
generate chimeric DNA polymerase coding sequences
(possessing novel properties), a preferred method
employs "overlap" PCR. In this method, the intended
junction sequence is designed into the PCR primers (at
their 5'-ends). Following the initial amplification of
the individual domains, the various products are
diluted (ca. 100 to 1000-fold) and combined, denatured,
annealed, extended, and then the final forward and
reverse primers are added for an otherwise standard PCR.
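
The primer design step of "overlap" PCR can be sketched as follows.
This is an illustration under stated assumptions, not the primers
used in the examples of this specification: the tail length of 15
nucleotides and annealing length of 20 nucleotides are arbitrary
choices, and the two domain sequences are hypothetical placeholders.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def junction_primers(upstream, downstream, tail=15, anneal=20):
    """Design the two junction primers for overlap PCR.

    The forward primer amplifies the downstream domain and carries the
    end of the upstream domain as its 5' tail. The reverse primer
    amplifies the upstream domain and carries the complement of the
    start of the downstream domain as its 5' tail.
    """
    forward = upstream[-tail:] + downstream[:anneal]
    reverse = reverse_complement(upstream[-anneal:] + downstream[:tail])
    return forward, reverse


# Hypothetical placeholder domains.
upstream_domain = "ACGT" * 10
downstream_domain = "TTGCA" * 8
fwd, rev = junction_primers(upstream_domain, downstream_domain)
print("forward:", fwd)
print("reverse:", rev)
```

The two first-round products then share the junction sequence, which
lets them anneal to each other and be extended before the outer
primers are added.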

Those of skill in the art recognize that the above
thermostable DNA polymerases with attenuated 5' to 3'
exonuclease activity are most easily constructed by
recombinant DNA techniques. When one desires to
produce one of the mutant enzymes of the present
invention, with attenuated 5' to 3' exonuclease
activity or a derivative or homologue of those enzymes,
the production of a recombinant form of the enzyme
typically involves the construction of an expression
vector, the transformation of a host cell with the
vector, and culture of the transformed host cell under
conditions such that expression will occur.

To construct the expression vector, a DNA is
obtained that encodes the mature (used here to include
all chimeras or muteins) enzyme or a fusion of the
mutant polymerase to an additional sequence that does
not destroy activity or to an additional sequence
cleavable under controlled conditions (such as
treatment with peptidase) to give an active protein.
The coding sequence is then placed in operable linkage
with suitable control sequences in an expression
vector. The vector can be designed to replicate
autonomously in the host cell or to integrate into the
chromosomal DNA of the host cell. The vector is used
to transform a suitable host, and the transformed host
is cultured under conditions suitable for expression of
the recombinant polymerase.

Each of the foregoing steps can be done in a
variety of ways. For example, the desired coding
sequence may be obtained from genomic fragments and
used directly in appropriate hosts. The construction
for expression vectors operable in a variety of hosts
is made using appropriate replicons and control
sequences, as set forth generally below. Construction
of suitable vectors containing the desired coding and
control sequences employs standard ligation and
restriction techniques that are well understood in the
art. Isolated plasmids, DNA sequences, or synthesized
oligonucleotides are cleaved, modified, and religated
in the form desired. Suitable restriction sites can,
if not normally available, be added to the ends of the
coding sequence so as to facilitate construction of an
expression vector, as exemplified below.

Site-specific DNA cleavage is performed by treating
with suitable restriction enzyme (or enzymes) under
conditions that are generally understood in the art and
specified by the manufacturers of commercially
available restriction enzymes. See, e.g., New England
Biolabs, Product Catalog. In general, about 1 µg of
plasmid or other DNA is cleaved by one unit of enzyme
in about 20 µl of buffer solution; in the examples
below, an excess of restriction enzyme is generally
used to ensure complete digestion of the DNA.
Incubation times of about one to two hours at about
37°C are typical, although variations can be
tolerated. After each incubation, protein is removed
by extraction with phenol and chloroform; this
extraction can be followed by ether extraction and
recovery of the DNA from aqueous fractions by
precipitation with ethanol. If desired, size
separation of the cleaved fragments may be performed by
polyacrylamide gel or agarose gel electrophoresis using
standard techniques. See, e.g., Methods in Enzymology,
1980, 65:499-560.

Restriction-cleaved fragments with single-strand
"overhanging" termini can be made blunt-ended
(double-strand ends) by treating with the large
fragment of E. coli DNA polymerase I (Klenow) in the
presence of the four deoxynucleoside triphosphates
(dNTPs) using incubation times of about 15 to 25
minutes at 20 to 25°C in 50 mM Tris-Cl pH 7.6, 50 mM
NaCl, 10 mM MgCl2, 10 mM DTT, and 5 to 10 µM dNTPs.
The Klenow fragment fills in at 5' protruding ends, but
chews back protruding 3' single strands, even though
the four dNTPs are present. If desired, selective
repair can be performed by supplying only one of the,
or selected, dNTPs within the limitations dictated by
the nature of the protruding ends. After treatment
with Klenow, the mixture is extracted with
phenol/chloroform and ethanol precipitated. Similar
results can be achieved using S1 nuclease, because
treatment under appropriate conditions with S1 nuclease
results in hydrolysis of any single-stranded portion of
a nucleic acid.

Synthetic oligonucleotides can be prepared using
the triester method of Matteucci et al., 1981, J. Am.
Chem. Soc. 103:3185-3191, or automated synthesis
methods. Kinasing of single strands prior to annealing
or for labeling is achieved using an excess, e.g.,
approximately 10 units, of polynucleotide kinase to
0.5 µM substrate in the presence of 50 mM Tris, pH 7.6,
10 mM MgCl2, 5 mM dithiothreitol (DTT), and 1 to 2 µM
ATP. If kinasing is for labeling of probe, the ATP
will contain high specific activity gamma-32P.

Ligations are performed in 15-30 µl volumes under
the following standard conditions and temperatures:
20 mM Tris-Cl, pH 7.5, 10 mM MgCl2, 10 mM DTT, 33 µg/ml
BSA, 10 mM-50 mM NaCl, and either 40 µM ATP and
0.01-0.02 (Weiss) units T4 DNA ligase at 0°C (for
ligation of fragments with complementary
single-stranded ends) or 1 mM ATP and 0.3-0.6 units T4
DNA ligase at 14°C (for "blunt end" ligation).
Intermolecular ligations of fragments with
complementary ends are usually performed at 33-100
µg/ml total DNA concentrations (5 to 100 nM total ends
concentration). Intermolecular blunt end ligations
(usually employing a 20 to 30 fold molar excess of
linkers, optionally) are performed at 1 µM total ends
concentration.
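
The relationship between DNA mass concentration and "total ends"
concentration depends on fragment length. The sketch below shows the
conversion for linear double-stranded fragments. It assumes an
average mass of about 650 g/mol per base pair, which is a common
approximation and not a value given in this text; the fragment
lengths are hypothetical examples.

```python
AVERAGE_BP_G_PER_MOL = 650.0  # assumed average mass of one base pair


def ends_concentration_nM(ug_per_ml, length_bp, bp_mass=AVERAGE_BP_G_PER_MOL):
    """Total ends concentration (nM) of linear double-stranded fragments.

    ug/ml is converted to g/l, divided by the fragment molar mass to
    give mol/l of fragments, then doubled because each linear fragment
    has two ends.
    """
    grams_per_liter = ug_per_ml * 1e-3
    fragment_molar = grams_per_liter / (length_bp * bp_mass)
    return 2.0 * fragment_molar * 1e9


# Hypothetical fragment lengths at the two ends of the stated range.
short_fragment = ends_concentration_nM(33.0, 1000)    # 1 kb at 33 ug/ml
long_fragment = ends_concentration_nM(100.0, 30000)   # 30 kb at 100 ug/ml
print(f"{short_fragment:.0f} nM ends, {long_fragment:.0f} nM ends")
```

Shorter fragments give more ends per microgram, which is why a mass
concentration range maps onto a range of ends concentrations.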

In vector construction, the vector fragment is
commonly treated with bacterial or calf intestinal
alkaline phosphatase (BAP or CIAP) to remove the 5'
phosphate and prevent religation and reconstruction of
the vector. BAP and CIAP digestion conditions are well
known in the art, and published protocols usually
accompany the commercially available BAP and CIAP
enzymes. To recover the nucleic acid fragments, the
preparation is extracted with phenol-chloroform and
ethanol precipitated to remove the phosphatase and
purify the DNA. Alternatively, religation of unwanted
vector fragments can be prevented by restriction enzyme
digestion before or after ligation, if appropriate
restriction sites are available.

For portions of vectors or coding sequences that
require sequence modifications, a variety of
site-specific primer-directed mutagenesis methods are
available. The polymerase chain reaction (PCR) can be
used to perform site-specific mutagenesis. In another
technique now standard in the art, a synthetic
oligonucleotide encoding the desired mutation is used
as a primer to direct synthesis of a complementary
nucleic acid sequence of a single-stranded vector, such
as pBS13+, that serves as a template for construction
of the extension product of the mutagenizing primer.
The mutagenized DNA is transformed into a host
bacterium, and cultures of the transformed bacteria are
plated and identified. The identification of modified
vectors may involve transfer of the DNA of selected
transformants to a nitrocellulose filter or other
membrane and the "lifts" hybridized with kinased
synthetic primer at a temperature that permits
hybridization of an exact match to the modified
sequence but prevents hybridization with the original
strand. Transformants that contain DNA that hybridizes
with the probe are then cultured and serve as a
reservoir of the modified DNA.

In the constructions set forth below, correct
ligations for plasmid construction are confirmed by
first transforming E. coli strain DG101 or another
suitable host with the ligation mixture. Successful
transformants are selected by ampicillin, tetracycline
or other antibiotic resistance or sensitivity or by
using other markers, depending on the mode of plasmid
construction, as is understood in the art. Plasmids
from the transformants are then prepared according to
the method of Clewell et al., 1969, Proc. Natl. Acad.
Sci. USA 62:1159, optionally following chloramphenicol
amplification (Clewell, 1972, J. Bacteriol. 110:667).
Another method for obtaining plasmid DNA is described
as the "Base-Acid" extraction method at page 11 of the
Bethesda Research Laboratories publication Focus,
volume 5, number 2, and very pure plasmid DNA can be
obtained by replacing steps 12 through 17 of the
protocol with CsCl/ethidium bromide ultracentrifugation
of the DNA. The isolated DNA is analyzed by
restriction enzyme digestion and/or sequenced by the
dideoxy method of Sanger et al., 1977, Proc. Natl.
Acad. Sci. USA 74:5463, as further described by Messing
et al., 1981, Nuc. Acids Res. 9:309, or by the method
of Maxam et al., 1980, Methods in Enzymology 65:499.

The control sequences, expression vectors, and
transformation methods are dependent on the type of
host cell used to express the gene. Generally,
procaryotic, yeast, insect, or mammalian cells are used
as hosts. Procaryotic hosts are in general the most
efficient and convenient for the production of
recombinant proteins and are therefore preferred for
the expression of the thermostable DNA polymerases of
the present invention.

The procaryote most frequently used to express
recombinant proteins is E. coli. For cloning and
sequencing, and for expression of constructions under
control of most bacterial promoters, E. coli K12 strain
MM294, obtained from the E. coli Genetic Stock Center
under CGSC #6135, can be used as the host. For
expression vectors with the P_L N_RBS control sequence,
E. coli K12 strain MC1000 lambda lysogen, N7N53cI857
SusP80, ATCC 39531, may be used. E. coli DG116, which
was deposited with the ATCC (ATCC 53606) on April 7,
1987, and E. coli KB2, which was deposited with the
ATCC (ATCC 53075) on March 29, 1985, are also useful
host cells. For M13 phage recombinants, E. coli
strains susceptible to phage infection, such as E. coli
K12 strain DG98, are employed. The DG98 strain was
deposited with the ATCC (ATCC 39768) on July 13, 1984.

However, microbial strains other than E. coli can
also be used, such as bacilli, for example Bacillus
subtilis, various species of Pseudomonas, and other
bacterial strains, for recombinant expression of the
thermostable DNA polymerases of the present invention.
In such procaryotic systems, plasmid vectors that
contain replication sites and control sequences derived
from the host or a species compatible with the host are
typically used.

For example, £. coli is typically transformed using 
derivatives of pBR322, described by Bolivar gt al. / 
1977, gene 2:95. Plasmid pBR322 contains genes for 
ampicillin and tetracycline resisteuice. These drug 
10 resistance markers can be either retained or destroyed 
in constructing the desired vector and so help to 
detect the presence of a desired recombinant. Commonly 
used procaryotic control sequences, i.e., a promoter 
for transcription initiation, optionally with an 
15 operator, along with a ribosome binding site sequence, 
include the B-lacteunase (penicillinase) and lactose 
(lac) promoter systems (Chang et al- / 1977, Nature 
198:1056), the tryptophan (trp) promoter system 
(Goeddel et ai- r 1980, Nuc , Acids Res . 8:4057), and the 
20 lambda-derived promoter (Shimatake et ai. , 1981, 

Nature 212.: 128) and N-gene ribosome binding site 
(^RBs) • ^ portable control system cassette is set 
forth in United States Patent No. 4,711,845, issued 
December 8, 1987. This cassette comprises a P^ 
25 promoter operably linked to the Njj3s in turn positioned 
upstream of a third DNA sequence having at least one 
restriction site that permits cleavage within six bp 3' 
of the Nj^g sequence. Also useful is the phosphatase A 
(phoA) system described by Chang et ai* in European 
30 Patent Publication No. 196,864, published October 8, 
1986. However, any available promoter system 
compatible with procaryotes can be used to construct a 
modified thermostable DNA polymerase expression vector 
of the invention. 
35 In addition to bacteria, eucaryotic microbes, such 

as yeast, can also be used as recombinant host cells. 
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LedDoratory strains of Saccharomvces cerevisiae . Baker's 
yeast, are most often used, although a number of other 
strains are commonly available. While vectors 
employing the two micron origin of replication are 
common (Broach, 1983, Meth. Enz. 101:307), other
plasmid vectors suitable for yeast expression are known
(see, for example, Stinchcomb et al., 1979, Nature
282:39; Tschempe et al., 1980, Gene 10:157; and Clarke
et al., 1983, Meth. Enz. 101:300). Control sequences
for yeast vectors include promoters for the synthesis
of glycolytic enzymes (Hess et al., 1968, Adv.
Enzyme Reg. 7:149; Holland et al., 1978, Biotechnology
17:4900; and Holland et al., 1981, J. Biol. Chem.
256:1385). Additional promoters known in the art
include the promoter for 3-phosphoglycerate kinase
(Hitzeman et al., 1980, J. Biol. Chem. 255:2073) and
those for other glycolytic enzymes, such as
glyceraldehyde 3-phosphate dehydrogenase, hexokinase,
pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase,
pyruvate kinase, triosephosphate isomerase,
phosphoglucose isomerase, and glucokinase. Other
promoters that have the additional advantage of 
transcription controlled by growth conditions are the 
promoter regions for alcohol dehydrogenase 2,
isocytochrome C, acid phosphatase, degradative enzymes
associated with nitrogen metabolism, and enzymes
responsible for maltose and galactose utilization
(Holland, supra).

Terminator sequences may also be used to enhance
expression when placed at the 3' end of the coding
sequence. Such terminators are found in the 3'
untranslated region following the coding sequences in
yeast-derived genes. Any vector containing a
yeast-compatible promoter, origin of replication, and
other control sequences is suitable for use in
constructing yeast expression vectors for the 
thermostable DNA polymerases of the present invention. 
The nucleotide sequences which code for the 
thermostable DNA polymerases of the present invention
can also be expressed in eucaryotic host cell cultures
derived from multicellular organisms. See, for
example, Tissue Culture, Academic Press, Cruz and
Patterson, editors (1973). Useful host cell lines
include COS-7, COS-A2, CV-1, murine cells such as
murine myelomas N51 and VERO, HeLa cells, and Chinese
hamster ovary (CHO) cells. Expression vectors for such
cells ordinarily include promoters and control
sequences compatible with mammalian cells such as, for
example, the commonly used early and late promoters
from Simian Virus 40 (SV 40) (Fiers et al., 1978,
Nature 273:113), or other viral promoters such as those
derived from polyoma, adenovirus 2, bovine papilloma
virus (BPV), or avian sarcoma viruses, or
immunoglobulin promoters and heat shock promoters. A
system for expressing DNA in mammalian systems using a 
BPV vector system is disclosed in U.S. Patent No. 
4,419,446. A modification of this system is described 
in U.S. Patent No. 4,601,978. General aspects of
mammalian cell host system transformations have been
described by Axel, U.S. Patent No. 4,399,216.
"Enhancer" regions are also important in optimizing
expression; these are, generally, sequences found
upstream of the promoter region. Origins of
replication may be obtained, if needed, from viral
sources. However, integration into the chromosome is a
common mechanism for DNA replication in eucaryotes.

Plant cells can also be used as hosts, and control
sequences compatible with plant cells, such as the
nopaline synthase promoter and polyadenylation signal
sequences (Depicker et al., 1982, J. Mol. Appl. Gen.
1:561) are available. Expression systems employing
insect cells utilizing the control systems provided by
baculovirus vectors have also been described (Miller et
al., 1986, Genetic Engineering (Setlow et al., eds.,
Plenum Publishing) 277-297). Insect cell-based
expression can be accomplished in Spodoptera
frugiperda. These systems can also be used to produce
recombinant thermostable polymerases of the present
invention.

Depending on the host cell used, transformation is
done using standard techniques appropriate to such
cells. The calcium treatment employing calcium
chloride, as described by Cohen, 1972, Proc. Natl.
Acad. Sci. USA 69:2110 is used for procaryotes or other
cells that contain substantial cell wall barriers.
Infection with Agrobacterium tumefaciens (Shaw et al.,
1983, Gene 23:315) is used for certain plant cells.
For mammalian cells, the calcium phosphate
precipitation method of Graham and van der Eb, 1978,
Virology 52:546 is preferred. Transformations into
yeast are carried out according to the method of Van
Solingen et al., 1977, J. Bact. 130:946 and Hsiao et
al., 1979, Proc. Natl. Acad. Sci. USA 76:3829.

Once the desired thermostable DNA polymerase with
altered 5' to 3' exonuclease activity has been
expressed in a recombinant host cell, purification of
the protein may be desired. Although a variety of
purification procedures can be used to purify the
recombinant thermostable polymerases of the invention,
fewer steps may be necessary to yield an enzyme
preparation of equal purity. Because E. coli host
proteins are heat-sensitive, the recombinant
thermostable DNA polymerases of the invention can be
substantially enriched by heat inactivating the crude
lysate. This step is done in the presence of a
sufficient amount of salt (typically 0.2-0.3 M ammonium
sulfate) to ensure dissociation of the thermostable DNA
polymerase from the host DNA and to reduce ionic 
interactions of thermostable DNA polymerase with other 
cell lysate proteins. 
In addition, the presence of 0.3 M ammonium sulfate
promotes hydrophobic interaction with a phenyl
sepharose column. Hydrophobic interaction
chromatography is a separation technique in which
substances are separated on the basis of differing
strengths of hydrophobic interaction with an uncharged
bed material containing hydrophobic groups. Typically,
the column is first equilibrated under conditions
favorable to hydrophobic binding, such as high ionic
strength. A descending salt gradient may then be used
to elute the sample.

According to the invention, an aqueous mixture
(containing the recombinant thermostable DNA polymerase
with altered 5' to 3' exonuclease activity) is loaded
onto a column containing a relatively strong
hydrophobic gel such as phenyl sepharose (manufactured
by Pharmacia) or Phenyl TSK (manufactured by Toyo
Soda). To promote hydrophobic interaction with a
phenyl sepharose column, a solvent is used that
contains, for example, greater than or equal to 0.3 M
ammonium sulfate, with 0.3 M being preferred, or
greater than or equal to 0.5 M NaCl. The column and
the sample are adjusted to 0.3 M ammonium sulfate in 50
mM Tris (pH 7.5) and 1.0 mM EDTA ("TE") buffer that
also contains 0.5 mM DTT, and the sample is applied to
the column. The column is washed with the 0.3 M
ammonium sulfate buffer. The enzyme may then be eluted
with solvents that attenuate hydrophobic interactions,
such as decreasing salt gradients, ethylene or
propylene glycol, or urea.
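
As a small worked illustration of the 0.3 M ammonium sulfate loading buffer described above, the following sketch computes the mass of salt needed for a given final volume. The molar mass of ammonium sulfate (132.14 g/mol) is a standard value that is not stated in the text, and the function name and the 1 litre volume are assumptions chosen only for illustration.

```python
# Standard molar mass of (NH4)2SO4; an assumption, not a value from the text.
AMMONIUM_SULFATE_G_PER_MOL = 132.14


def grams_for_molarity(molarity_mol_per_l, volume_l, molar_mass_g_per_mol):
    """Mass of solute (g) needed to reach a target molarity in a final volume."""
    return molarity_mol_per_l * volume_l * molar_mass_g_per_mol


# Hypothetical case: 1 litre of buffer at 0.3 M ammonium sulfate.
grams = grams_for_molarity(0.3, 1.0, AMMONIUM_SULFATE_G_PER_MOL)
print(f"{grams:.1f} g ammonium sulfate per litre of 0.3 M buffer")
```

The salt would be dissolved in less than the final volume of buffer and then brought up to volume, since the calculation refers to the final volume of solution.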

For long-term stability, the thermostable DNA
polymerase enzymes of the present invention can be
stored in a buffer that contains one or more non-ionic
polymeric detergents. Such detergents are generally
those that have a molecular weight in the range of
approximately 100 to 250,000 daltons, preferably about
4,000 to 200,000 daltons, and stabilize the enzyme at a
pH of from about 3.5 to about 9.5, preferably from
about 4 to 8.5. Examples of such detergents include
those specified on pages 295-298 of McCutcheon's
Emulsifiers & Detergents, North American edition
(1983), published by the McCutcheon Division of MC
Publishing Co., 175 Rock Road, Glen Rock, NJ (USA) and
copending Serial No. 387,003, filed July 28, 1989, each
of which is incorporated herein by reference.

Preferably, the detergents are selected from the
group comprising ethoxylated fatty alcohol ethers and
lauryl ethers, ethoxylated alkyl phenols, octylphenoxy
polyethoxy ethanol compounds, modified oxyethylated
and/or oxypropylated straight-chain alcohols,
polyethylene glycol monooleate compounds, polysorbate
compounds, and phenolic fatty alcohol ethers. More
particularly preferred are Tween 20, a polyoxyethylated
(20) sorbitan monolaurate from ICI Americas Inc.,
Wilmington, DE, and Iconol NP-40, an ethoxylated alkyl
phenol (nonyl) from BASF Wyandotte Corp., Parsippany,
NJ.

The thermostable enzymes of this invention may be 
used for any purpose in which such enzyme activity is 
necessary or desired.

DNA sequencing by the Sanger dideoxynucleotide
method (Sanger et al., 1977, Proc. Natl. Acad. Sci. USA
74:5463-5467) has undergone significant refinement in
recent years, including the development of novel
vectors (Yanisch-Perron et al., 1985, Gene 33:103-119),
base analogs (Mills et al., 1979, Proc. Natl. Acad.
Sci. USA 76:2232-2235, and Barr et al., 1986,
BioTechniques 4:428-432), enzymes (Tabor et al., 1987,
Proc. Natl. Acad. Sci. USA 84:4763-4771, and Innis,
M.A. et al., 1988, Proc. Natl. Acad. Sci. USA
85:9436-9440), and instruments for partial automation
of DNA sequence analysis (Smith et al., 1986, Nature
321:674-679; Prober et al., 1987, Science 238:336-341;
and Ansorge et al., 1987, Nuc. Acids Res.
15:4593-4602). The basic dideoxy sequencing procedure
involves (i) annealing an oligonucleotide primer to a
suitable single or denatured double stranded DNA
template; (ii) extending the primer with DNA polymerase
in four separate reactions, each containing one
α-labeled dNTP or ddNTP (alternatively, a labeled
primer can be used), a mixture of unlabeled dNTPs, and
one chain-terminating dideoxynucleotide-5'-triphosphate
(ddNTP); (iii) resolving the four sets of reaction
products on a high-resolution polyacrylamide-urea gel;
and (iv) producing an autoradiographic image of the gel
that can be examined to infer the DNA sequence.
Alternatively, fluorescently labeled primers or
nucleotides can be used to identify the reaction
products. Known dideoxy sequencing methods utilize a
DNA polymerase such as the Klenow fragment of E. coli
DNA polymerase I, reverse transcriptase, Taq DNA
polymerase, or a modified T7 DNA polymerase.
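
The logic of steps (ii) through (iv) can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions: every possible termination product is present, fragments are resolved perfectly by length, and the sequence `GATTACA` and both function names are hypothetical illustrations rather than anything from the procedure itself.

```python
def termination_lengths(synthesized, base):
    """Fragment lengths seen in the ddNTP reaction for `base`.

    `synthesized` is the strand made by primer extension (5'->3');
    a fragment of length i+1 ends wherever position i carries `base`.
    """
    return [i + 1 for i, b in enumerate(synthesized) if b == base]


def read_ladder(lanes):
    """Infer the sequence by ordering the bands of all four lanes by length."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


synthesized = "GATTACA"  # hypothetical extension product
lanes = {base: termination_lengths(synthesized, base) for base in "ACGT"}
print(lanes)
print(read_ladder(lanes))
```

Each lane corresponds to one of the four reactions, and reading the bands from shortest to longest recovers the synthesized strand.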
The introduction of commercial kits has vastly
simplified the art, making DNA sequencing a routine
technique for any laboratory. However, there is still
a need in the art for sequencing protocols that work
well with nucleic acids that contain secondary
structure such as palindromic hairpin loops and with
G+C-rich DNA. Single stranded DNAs can form secondary
structure, such as a hairpin loop, that can seriously
interfere with a dideoxy sequencing protocol, either
through improper termination in the extension reaction
or, in the case of an enzyme with 5' to 3' exonuclease
activity, through cleavage of the template strand at the
juncture of the hairpin. Since high temperature
destabilizes secondary structure, the ability to
conduct the extension reaction at a high temperature,
i.e., 70-75°C, with a thermostable DNA polymerase
results in a significant improvement in the sequencing
of DNA that contains such secondary structure.
However, temperatures compatible with polymerase
extension do not eliminate all secondary structure. A
5' to 3' exonuclease-deficient thermostable DNA
polymerase would be a further improvement in the art,
since the polymerase could synthesize through the
hairpin in a strand displacement reaction, rather than
cleaving the template and producing an improper
termination, i.e., an extension run-off fragment.

As an alternative to basic dideoxy sequencing,
cycle dideoxy sequencing is a linear, asymmetric
amplification of target sequences in the presence of
dideoxy chain terminators. A single cycle produces a
family of extension products of all possible lengths.
Following denaturation of the extension reaction
product from the DNA template, multiple cycles of
primer annealing and primer extension occur in the
presence of dideoxy terminators. The process is
distinct from PCR in that only one primer is used, the
growth of the sequencing reaction products in each
cycle is linear, and the amplification products are
heterogeneous in length and do not serve as template
for the next reaction. Cycle dideoxy sequencing is a
technique providing advantages for laboratories using
automated DNA sequencing instruments and for other high
volume sequencing laboratories. It is possible to
directly sequence genomic DNA, without cloning, due to
the specificity of the technique and the increased
amount of signal generated. Cycle sequencing protocols
accommodate single and double stranded templates,
including genomic, cloned, and PCR-amplified templates.
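
The contrast drawn above between linear growth in cycle sequencing and exponential growth in PCR can be made concrete with a short calculation. The sketch below is an idealized model, assuming one extension product per template per cycle and an `efficiency` parameter (hypothetical, between 0 and 1) for PCR; real reactions plateau and fall short of these figures.

```python
def cycle_sequencing_yield(template_copies, cycles):
    """Linear growth: products never serve as template, so each cycle
    adds at most one extension product per original template."""
    return template_copies * cycles


def pcr_yield(template_copies, cycles, efficiency=1.0):
    """Exponential growth: products of one cycle are templates in the next."""
    return template_copies * (1.0 + efficiency) ** cycles


# Hypothetical starting amount of 1000 template copies.
for n in (1, 10, 20, 30):
    print(n, cycle_sequencing_yield(1000, n), pcr_yield(1000, n))
```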
Thermostable DNA polymerases have several
advantages in cycle sequencing: they tolerate the
stringent annealing temperatures which are required for
specific hybridization of primer to genomic targets as
well as the repeated high temperature denaturation
that occurs in each cycle.
Performing the extension reaction at high temperatures,
i.e., 70-75°C, results in a significant improvement in
sequencing results with DNA that contains secondary
structure, due to the destabilization of secondary
structure. However, such temperatures will not
eliminate all secondary structure. A 5' to 3'
exonuclease-deficient thermostable DNA polymerase would
be a further improvement in the art, since the
polymerase could synthesize through the hairpin in a
strand displacement reaction, rather than cleaving the
template and creating an improper termination.
Additionally, like PCR, cycle sequencing suffers from
the phenomenon of product strand renaturation. In the
case of a thermostable DNA polymerase possessing 5' to
3' exonuclease activity, extension of a primer into a
double stranded region created by product strand
renaturation will result in cleavage of the renatured
complementary product strand. The cleaved strand will
be shorter and thus appear as an improper termination.
In addition, the correct, previously synthesized
termination signal will be attenuated. A thermostable
DNA polymerase deficient in 5' to 3' exonuclease
activity will improve the art, in that such extension
product fragments will not be formed. A variation of
cycle sequencing involves the simultaneous generation
of sequencing ladders for each strand of a double
stranded template while sustaining some degree of
amplification (Ruano and Kidd, Proc. Natl. Acad. Sci.
USA, 1991, 88:2815-2819). This method of coupled
amplification and sequencing would benefit in a similar
fashion as standard cycle sequencing from the use of a
thermostable DNA polymerase deficient in 5' to 3'
exonuclease activity.

In a particularly preferred embodiment, the enzymes
in which the 5' to 3' exonuclease activity has been
reduced or eliminated catalyze the nucleic acid
amplification reaction known as PCR, and as stated
above, with the resultant effect of producing a better
yield of desired product than is achieved with the
respective native enzymes which have greater amounts of
the 5' to 3' exonuclease activity. Improved yields
result because previously synthesized product is no
longer degraded by 5' to 3' exonuclease
activity. This process for amplifying nucleic acid
sequences is disclosed and claimed in U.S. Patent Nos.
4,683,202 and 4,865,188, each of which is incorporated
herein by reference. The PCR nucleic acid
amplification method involves amplifying at least one
specific nucleic acid sequence contained in a nucleic
acid or a mixture of nucleic acids and in the most
common embodiment, produces double-stranded DNA. Aside
from improved yields, thermostable DNA polymerases with
attenuated 5' to 3' exonuclease activity exhibit an
improved ability to generate longer PCR products, an
improved ability to produce products from G+C-rich
templates and an improved ability to generate PCR
products and DNA sequencing ladders from templates with
a high degree of secondary structure.

For ease of discussion, the protocol set forth
below assumes that the specific sequence to be
amplified is contained in a double-stranded nucleic
acid. However, the process is equally useful in
amplifying single-stranded nucleic acid, such as mRNA,
although in the preferred embodiment the ultimate
product is still double-stranded DNA. In the
amplification of a single-stranded nucleic acid, the
first step involves the synthesis of a complementary
strand (one of the two amplification primers can be
used for this purpose), and the succeeding steps
proceed as in the double-stranded amplification process
described below.

This amplification process comprises the steps of:

(a) contacting each nucleic acid strand with four
different nucleoside triphosphates and two
oligonucleotide primers for each specific sequence
being amplified, wherein each primer is selected to be
substantially complementary to the different strands of
the specific sequence, such that the extension product
synthesized from one primer, when separated from its
complement, can serve as a template for synthesis of
the extension product of the other primer, said
contacting being at a temperature that allows
hybridization of each primer to a complementary nucleic
acid strand;

(b) contacting each nucleic acid strand, at the
same time as or after step (a), with a thermostable DNA
polymerase of the present invention that enables
combination of the nucleoside triphosphates to form
primer extension products complementary to each strand
of the specific nucleic acid sequence;

(c) maintaining the mixture from step (b) at an
effective temperature for an effective time to promote
the activity of the enzyme and to synthesize, for each
different sequence being amplified, an extension
product of each primer that is complementary to each
nucleic acid strand template, but not so high as to
separate each extension product from the complementary
strand template;

(d) heating the mixture from step (c) for an
effective time and at an effective temperature to
separate the primer extension products from the
templates on which they were synthesized to produce
single-stranded molecules but not so high as to
denature irreversibly the enzyme;

(e) cooling the mixture from step (d) for an
effective time and to an effective temperature to
promote hybridization of a primer to each of the
single-stranded molecules produced in step (d); and

(f) maintaining the mixture from step (e) at an
effective temperature for an effective time to promote
the activity of the enzyme and to synthesize, for each
different sequence being amplified, an extension
product of each primer that is complementary to each
nucleic acid template produced in step (d) but not so
high as to separate each extension product from the
complementary strand template.

The effective times and
temperatures in steps (e) and (f) may coincide, so that
steps (e) and (f) can be carried out simultaneously.
Steps (d)-(f) are repeated until the desired level of
amplification is obtained.
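
The repetition of steps (d) through (f) amounts to a simple temperature program. The sketch below builds such a program as a flat list of steps; it is a minimal illustration, and the specific temperatures, hold times, cycle count, and the names `Step`, `build_program`, and `total_minutes` are all assumptions, since the steps above specify only "effective" times and temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program(cycles, denature, anneal, extend):
    """Repeat steps (d), (e), (f) `cycles` times; return the flat step list."""
    program = []
    for _ in range(cycles):
        program.extend([denature, anneal, extend])
    return program


def total_minutes(program):
    """Total hold time in minutes, ignoring temperature ramp times."""
    return sum(step.seconds for step in program) / 60


# Illustrative values only; not taken from the text.
denature = Step("denature (d)", 95.0, 30)
anneal = Step("anneal (e)", 55.0, 30)
extend = Step("extend (f)", 72.0, 60)

program = build_program(25, denature, anneal, extend)
print(len(program), "steps,", total_minutes(program), "minutes of hold time")
```

Where the effective conditions for steps (e) and (f) coincide, the same helper could be called with a single combined anneal/extend step in place of the two separate ones.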

The amplification method is useful not only for
producing large amounts of a specific nucleic acid
sequence of known sequence but also for producing
nucleic acid sequences that are known to exist but are
not completely specified. One need know only a
sufficient number of bases at both ends of the sequence
in sufficient detail so that two oligonucleotide
primers can be prepared that will hybridize to
different strands of the desired sequence at relative
positions along the sequence such that an extension
product synthesized from one primer, when separated
from the template (complement), can serve as a template
for extension of the other primer into a nucleic acid
sequence of defined length. The greater the knowledge
about the bases at both ends of the sequence, the
greater can be the specificity of the primers for the
target nucleic acid sequence and the efficiency of the
process and specificity of the reaction.

In any case, an initial copy of the sequence to be
amplified must be available, although the sequence need
not be pure or a discrete molecule. In general, the
amplification process involves a chain reaction for
producing, in exponential quantities relative to the
number of reaction steps involved, at least one
specific nucleic acid sequence given that (a) the ends
of the required sequence are known in sufficient detail
that oligonucleotides can be synthesized that will
hybridize to them and (b) that a small amount of the
sequence is available to initiate the chain reaction.
The product of the chain reaction will be a discrete
nucleic acid duplex with termini corresponding to the
5' ends of the specific primers employed.
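
The statement that the product is a discrete duplex whose termini correspond to the 5' ends of the primers can be illustrated with a small in silico prediction. The sketch below assumes exact primer matches only, and the template, the primer sequences, and the function names are hypothetical examples.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def predict_amplicon(template, forward_primer, reverse_primer):
    """Return the top-strand sequence of the expected product, or None.

    `template` is the top strand (5'->3'). The forward primer matches the
    top strand; the reverse primer anneals to the top strand, so its
    reverse complement is searched for downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    rc = reverse_complement(reverse_primer)
    end = template.find(rc, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(rc)]


template = "TTGACCATGGCTAGCTAGGATCCGATTACAGGCTTAACGGTCCATT"  # hypothetical
forward = "ATGGCTAGC"
reverse = "GACCGTTAAGCC"  # written 5'->3'
product = predict_amplicon(template, forward, reverse)
print(product, len(product))
```

The predicted product begins with the forward primer and ends with the reverse complement of the reverse primer, so flanking template sequence outside the primers is not part of the duplex.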

Any nucleic acid sequence, in purified or
nonpurified form, can be utilized as the starting
nucleic acid(s), provided it contains or is suspected
to contain the specific nucleic acid sequence one
desires to amplify. The nucleic acid to be amplified
can be obtained from any source, for example, from
plasmids such as pBR322, from cloned DNA or RNA, or
from natural DNA or RNA from any source, including
bacteria, yeast, viruses, organelles, and higher
organisms such as plants and animals. DNA or RNA may
be extracted from blood, tissue material such as
chorionic villi, or amniotic cells by a variety of
techniques. See, e.g., Maniatis et al., 1982,
Molecular Cloning: A Laboratory Manual (Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY)
pp. 280-281. Thus, the process may employ, for
example, DNA or RNA, including messenger RNA, which DNA
or RNA may be single-stranded or double-stranded. In
addition, a DNA-RNA hybrid that contains one strand of
each may be utilized. A mixture of any of these
nucleic acids can also be employed as can nucleic acids
produced from a previous amplification reaction (using
the same or different primers). The specific nucleic
acid sequence to be amplified can be only a fraction of
a large molecule or can be present initially as a
discrete molecule, so that the specific sequence
constitutes the entire nucleic acid.

The sequence to be amplified need not be present
initially in a pure form; the sequence can be a minor
fraction of a complex mixture, such as a portion of the
β-globin gene contained in whole human DNA (as
exemplified in Saiki et al., 1985, Science
230:1530-1534) or a portion of a nucleic acid sequence
due to a particular microorganism, which organism might
constitute only a very minor fraction of a particular
biological sample. The cells can be directly used in
the amplification process after suspension in hypotonic
buffer and heat treatment at about 90°C-100°C until
cell lysis and dispersion of intracellular components
occur (generally 1 to 15 minutes). After the heating
step, the amplification reagents may be added directly
to the lysed cells. The starting nucleic acid sequence
can contain more than one desired specific nucleic acid
sequence. The amplification process is useful not only
for producing large amounts of one specific nucleic
acid sequence but also for amplifying simultaneously
more than one different specific nucleic acid sequence
located on the same or different nucleic acid molecules.

Primers play a key role in the PCR process. The
word "primer" as used in describing the amplification
process can refer to more than one primer, particularly
in the case where there is some ambiguity in the
information regarding the terminal sequence(s) of the
fragment to be amplified or where one employs the
degenerate primer process described in PCT Application
No. 91/05753, filed August 13, 1991. For instance, in
the case where a nucleic acid sequence is inferred from
protein sequence information, a collection of primers
containing sequences representing all possible codon
variations based on degeneracy of the genetic code can
be used for each strand. One primer from this
collection will be sufficiently homologous with a
portion of the desired sequence to be amplified so as
to be useful for amplification.
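
A collection of primers covering all codon variations can be enumerated mechanically from a short peptide. The sketch below is a minimal illustration: the codon table is a deliberately partial excerpt of the standard genetic code covering only the residues used, and the peptide `MKDWFH` and the function name are hypothetical.

```python
from itertools import product

# Partial standard genetic code (DNA codons); only the residues used below.
CODONS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "D": ["GAT", "GAC"],
    "F": ["TTT", "TTC"],
    "H": ["CAT", "CAC"],
}


def degenerate_primers(peptide):
    """All coding sequences for `peptide`, one per codon combination."""
    choices = [CODONS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


pool = degenerate_primers("MKDWFH")  # hypothetical peptide
print(len(pool), "primers of", len(pool[0]), "nucleotides")
print(pool[:3])
```

The pool size is the product of the per-residue codon counts, which is why peptides rich in singly or doubly degenerate residues are convenient to work with. These are sense-strand sequences; primers for the opposite strand would be their reverse complements.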

In addition, more than one specific nucleic acid
sequence can be amplified from the first nucleic acid
or mixture of nucleic acids, so long as the appropriate
number of different oligonucleotide primers are
utilized. For example, if two different specific
nucleic acid sequences are to be produced, four primers
are utilized. Two of the primers are specific for one
of the specific nucleic acid sequences, and the other
two primers are specific for the second specific
nucleic acid sequence. In this manner, each of the two
different specific sequences can be produced
exponentially by the present process.

A sequence within a given sequence can be amplified
after a given number of amplification cycles to obtain
greater specificity in the reaction by adding, after at
least one cycle of amplification, a set of primers that
are complementary to internal sequences (i.e.,
sequences that are not on the ends) of the sequence to
be amplified. Such primers can be added at any stage
and will provide a shorter amplified fragment.
Alternatively, a longer fragment can be prepared by
using primers with non-complementary ends but having
some overlap with the primers previously utilized in
the amplification.

Primers also play a key role when the amplification
process is used for in vitro mutagenesis. The product
of an amplification reaction where the primers employed
are not exactly complementary to the original template
will contain the sequence of the primer rather than the
template, so introducing an in vitro mutation. In
further cycles, this mutation will be amplified with an
undiminished efficiency because no further mispaired
priming is required. The process of making an altered
DNA sequence as described above could be repeated on
the altered DNA using different primers to induce
further sequence changes. In this way, a series of
mutated sequences can gradually be produced wherein
each new addition to the series differs from the last
in a minor way, but from the original DNA source
sequence in an increasingly major way.
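
The way a mispaired primer writes its own sequence into the product can be sketched as follows. This is a toy model with assumptions stated plainly: the primer is placed at the ungapped site with the fewest mismatches, annealing is allowed up to an arbitrary `max_mismatches` threshold, and the sequences and function names are hypothetical.

```python
def best_site(template, primer):
    """(mismatches, index) of the best ungapped primer placement."""
    width = len(primer)
    return min(
        (sum(a != b for a, b in zip(primer, template[i:i + width])), i)
        for i in range(len(template) - width + 1)
    )


def mutagenize(template, primer, max_mismatches=2):
    """Product strand: the primer itself followed by copied template.

    Returns None when the primer does not anneal within the threshold.
    """
    mismatches, site = best_site(template, primer)
    if mismatches > max_mismatches:
        return None
    return primer + template[site + len(primer):]


template = "ATGGCTAGCTAGGATCCGATTACA"  # hypothetical target, top strand
primer = "ATGGCAAGCTAG"                # one mismatch (T -> A) at index 5
product = mutagenize(template, primer)
print(template)
print(product)
```

Because the product now matches the primer exactly, later cycles prime without any mismatch, which is the reason given above for the undiminished efficiency.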

Because the primer can contain as part of its
sequence a non-complementary sequence, provided that a
sufficient amount of the primer contains a sequence
that is complementary to the strand to be amplified,
many other advantages can be realized. For example, a
nucleotide sequence that is not complementary to the
template sequence (such as, e.g., a promoter, linker,
coding sequence, etc.) may be attached at the 5' end of
one or both of the primers and so appended to the
product of the amplification process. After the
extension primer is added, sufficient cycles are run to
achieve the desired amount of new template containing
the non-complementary nucleotide insert. This allows
production of large quantities of the combined
fragments in a relatively short period of time (e.g.,
two hours or less) using a simple technique.

Oligonucleotide primers can be prepared using any
suitable method, such as, for example, the
phosphotriester and phosphodiester methods described
above, or automated embodiments thereof. In one such
automated embodiment, diethylphosphoramidites are used
as starting materials and can be synthesized as
described by Beaucage et al., 1981, Tetrahedron Letters
22:1859-1862. One method for synthesizing
oligonucleotides on a modified solid support is
described in U.S. Patent No. 4,458,066. One can also
use a primer that has been isolated from a biological
source (such as a restriction endonuclease digest).

No matter what primers are used, however, the
reaction mixture must contain a template for PCR to
occur, because the specific nucleic acid sequence is
produced by using a nucleic acid containing that
sequence as a template. The first step involves
contacting each nucleic acid strand with four different
nucleoside triphosphates and two oligonucleotide
primers for each specific nucleic acid sequence being
amplified or detected. If the nucleic acids to be
amplified or detected are DNA, then the nucleoside
triphosphates are usually dATP, dCTP, dGTP, and dTTP,
although various nucleotide derivatives can also be
used in the process. For example, when using PCR for
the detection of a known sequence in a sample of
unknown sequences, dTTP is often replaced by dUTP in
order to reduce contamination between samples as taught
in PCT Application No. 91/05210 filed July 23, 1991,
incorporated herein by reference.

The concentration of nucleoside triphosphates can
vary widely. Typically, the concentration is 50 to 200
µM in each dNTP in the buffer for amplification, and
MgCl2 is present in the buffer in an amount of 1 to 3
mM to activate the polymerase and increase the
specificity of the reaction. However, dNTP
concentrations of 1 to 20 µM may be preferred for some
applications, such as DNA sequencing or generating
radiolabeled probes at high specific activity.
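
Setting up a reaction within the concentration ranges given above is a matter of the usual dilution relation C1*V1 = C2*V2. The sketch below is illustrative only: the 100 µl reaction volume, the 10 mM dNTP stock, and the 25 mM MgCl2 stock are assumed values that do not come from the text, and the function name is hypothetical.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock to add, from C1*V1 = C2*V2.

    `final_conc` and `stock_conc` must be in the same unit.
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * reaction_volume_ul / stock_conc


reaction_ul = 100.0  # assumed reaction volume

# 200 uM of each dNTP from an assumed 10 mM (10,000 uM) stock
dntp_ul = stock_volume_ul(200.0, 10_000.0, reaction_ul)

# 2 mM MgCl2 from an assumed 25 mM stock
mgcl2_ul = stock_volume_ul(2.0, 25.0, reaction_ul)

print(f"each dNTP: {dntp_ul:.1f} ul, MgCl2: {mgcl2_ul:.1f} ul")
```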

The nucleic acid strands of the target nucleic acid
serve as templates for the synthesis of additional
nucleic acid strands, which are extension products of
the primers. This synthesis can be performed using any
suitable method, but generally occurs in a buffered
aqueous solution, preferably at a pH of 7 to 9, most
preferably about 8. To facilitate synthesis, a molar
excess of the two oligonucleotide primers is added to
the buffer containing the template strands. As a
practical matter, the amount of primer added will
generally be in molar excess over the amount of
complementary strand (template) when the sequence to be
amplified is contained in a mixture of complicated
long-chain nucleic acid strands. A large molar excess
is preferred to improve the efficiency of the process.
Accordingly, primer:template ratios of at least 1000:1
or higher are generally employed for cloned DNA
templates, and primer:template ratios of about 10^:1
or higher are generally employed for amplification from
complex genomic samples.
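
Whether a given reaction meets a target primer:template ratio can be checked with a rough molecule count. The sketch below assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, a common approximation that is not from the text; the 50 pmol primer amount, the 1 ng of a 4,361 bp plasmid (the length of pBR322), and the function names are illustrative assumptions.

```python
AVOGADRO = 6.022e23
BP_G_PER_MOL = 650.0  # approximate average mass of one base pair of dsDNA


def template_molecules(mass_ng, length_bp):
    """Approximate number of double-stranded template molecules."""
    moles = (mass_ng * 1e-9) / (length_bp * BP_G_PER_MOL)
    return moles * AVOGADRO


def primer_molecules(pmol):
    """Number of primer molecules in `pmol` picomoles."""
    return pmol * 1e-12 * AVOGADRO


def primer_template_ratio(primer_pmol, template_ng, template_bp):
    """Molar ratio of primer to template molecules."""
    return primer_molecules(primer_pmol) / template_molecules(
        template_ng, template_bp
    )


# Hypothetical reaction: 50 pmol of primer and 1 ng of a 4,361 bp plasmid.
ratio = primer_template_ratio(50.0, 1.0, 4361)
print(f"primer:template is roughly {ratio:.2g}:1")
```

For this hypothetical plasmid case the ratio comes out on the order of 10^5:1, comfortably above the 1000:1 figure given for cloned templates.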

The mixture of template, primers, and nucleoside
triphosphates is then treated according to whether the
nucleic acids being amplified or detected are double-
or single-stranded. If the nucleic acids are
single-stranded, then no denaturation step need be
employed prior to the first extension cycle, and the
reaction mixture is held at a temperature that promotes
hybridization of the primer to its complementary target
(template) sequence. Such temperature is generally
from about 35°C to 65°C or more, preferably about 37°C
to 60°C, for an effective time, generally from a few
seconds to five minutes, preferably from 30 seconds to
one minute. A hybridization temperature of 35°C to
70°C may be used for 5' to 3' exonuclease mutant
thermostable DNA polymerases. Primers that are 15
nucleotides or longer in length are used to increase
the specificity of primer hybridization. Shorter
primers require lower hybridization temperatures.

The complement to the original single-stranded 
nucleic acids can be synthesized by adding the
thermostable DNA polymerase of the present invention in
the presence of the appropriate buffer, dNTPs, and one
or more oligonucleotide primers. If an appropriate
single primer is added, the primer extension product
will be complementary to the single-stranded nucleic
acid and will be hybridized with the nucleic acid
strand in a duplex of strands of equal or unequal
length (depending on where the primer hybridizes to the
template), which may then be separated into single
strands as described above to produce two single,
separated, complementary strands. A second primer
would then be added so that subsequent cycles of primer
extension would occur using both the original
single-stranded nucleic acid and the extension product
of the first primer as templates. Alternatively, two
or more appropriate primers (one of which will prime
synthesis using the extension product of the other
primer as a template) can be added to the
single-stranded nucleic acid and the reaction carried 
out. 

If the nucleic acid contains two strands, as in the
case of amplification of a double-stranded target or
second-cycle amplification of a single-stranded target,
the strands of nucleic acid must be separated before
the primers are hybridized. This strand separation can
be accomplished by any suitable denaturing method,
including physical, chemical or enzymatic means. One
preferred physical method of separating the strands of
the nucleic acid involves heating the nucleic acid
until complete (>99%) denaturation occurs. Typical
heat denaturation involves temperatures ranging from
about 80°C to 105°C for times generally ranging from
about a few seconds to minutes, depending on the
composition and size of the nucleic acid. Preferably,
the effective denaturing temperature is 90°C-100°C for
a few seconds to 1 minute. Strand separation may also
be induced by an enzyme from the class of enzymes known
as helicases or the enzyme RecA, which has helicase
activity and in the presence of ATP is known to
denature DNA. The reaction conditions suitable for
separating the strands of nucleic acids with helicases
are described by Kuhn Hoffmann-Berling, 1978,
CSH-Quantitative Biology 43:63, and techniques for
using RecA are reviewed in Radding, 1982, Ann. Rev.
Genetics 16:405-437. The denaturation produces two
separated complementary strands of equal or unequal
length.

If the double-stranded nucleic acid is denatured by 
heat, the reaction mixture is allowed to cool to a 
temperature that promotes hybridization of each primer 
to the complementary target (template) sequence. This
temperature is usually from about 35°C to 65°C or more,
depending on reagents, preferably 37°C to 60°C. The
hybridization temperature is maintained for an
effective time, generally a few seconds to minutes, and
preferably 10 seconds to 1 minute. In practical terms,
the temperature is simply lowered from about 95°C to as
low as 37°C, and hybridization occurs at a temperature
within this range. 

Whether the nucleic acid is single- or 
double-stranded, the thermostable DNA polymerase of the
present invention can be added prior to or during the
denaturation step or when the temperature is being
reduced to or is in the range for promoting
hybridization. Although the thermostability of the
polymerases of the invention allows one to add such
polymerases to the reaction mixture at any time, one
can substantially inhibit non-specific amplification by
adding the polymerase to the reaction mixture at a
point in time when the mixture will not be cooled below
the stringent hybridization temperature. After
hybridization, the reaction mixture is then heated to
or maintained at a temperature at which the activity of
the enzyme is promoted or optimized, i.e., a
temperature sufficient to increase the activity of the
enzyme in facilitating synthesis of the primer
extension products from the hybridized primer and
template. The temperature must actually be sufficient
to synthesize an extension product of each primer that
is complementary to each nucleic acid template, but
must not be so high as to denature each extension
product from its complementary template (i.e., the
temperature is generally less than about 80°C to 90°C).

Depending on the nucleic acid(s) employed, the 
typical temperature effective for this synthesis 
reaction generally ranges from about 40°C to 80°C,
preferably 50°C to 75°C. The temperature more
preferably ranges from about 65°C to 75°C for the
thermostable DNA polymerases of the present invention.
The period of time required for this synthesis may
range from about 10 seconds to several minutes or more,
depending mainly on the temperature, the length of the
nucleic acid, the enzyme, and the complexity of the
nucleic acid mixture. The extension time is usually 
about 30 seconds to a few minutes. If the nucleic acid 
is longer, a longer time period is generally required 
for complementary strand synthesis.

The newly synthesized strand and the complement
nucleic acid strand form a double-stranded molecule
that is used in the succeeding steps of the
amplification process. In the next step, the strands
of the double-stranded molecule are separated by heat
denaturation at a temperature and for a time effective
to denature the molecule, but not at a temperature and
for a period so long that the thermostable enzyme is
completely and irreversibly denatured or inactivated.
After this denaturation of template, the temperature is
decreased to a level that promotes hybridization of the
primer to the complementary single-stranded molecule
(template) produced from the previous step, as
described above.

After this hybridization step, or concurrently with
the hybridization step, the temperature is adjusted to
a temperature that is effective to promote the activity
of the thermostable enzyme to enable synthesis of a
primer extension product using as a template both the
newly synthesized and the original strands. The
temperature again must not be so high as to separate
(denature) the extension product from its template, as
described above. Hybridization may occur during this
step, so that the previous step of cooling after
denaturation is not required. In such a case, using
simultaneous steps, the preferred temperature range is
50°C to 70°C.
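
The three-step and two-step cycles described above can be sketched as a small data structure with a range check. This is a minimal Python sketch: the `Step` class, the step names, the hold times, and the example profiles are hypothetical, and only the temperature windows are taken from the ranges quoted in the text.

```python
from dataclasses import dataclass
from typing import List

# Temperature windows in degrees C, as quoted in the text above.
# The text gives the annealing upper bound as "65 or more"; treating 65 as a
# hard limit is a simplifying assumption of this sketch.
RANGES = {
    "denature": (80.0, 105.0),
    "anneal": (35.0, 65.0),
    "extend": (40.0, 80.0),
    "anneal_extend": (50.0, 70.0),  # simultaneous hybridization and extension
}


@dataclass(frozen=True)
class Step:
    name: str        # one of the keys of RANGES
    temp_c: float    # hold temperature in degrees C
    seconds: float   # hold time (illustrative only; not checked here)


def out_of_range_steps(cycle: List[Step]) -> List[str]:
    """Return the names of steps whose temperature falls outside its window."""
    bad = []
    for step in cycle:
        low, high = RANGES[step.name]
        if not (low <= step.temp_c <= high):
            bad.append(step.name)
    return bad


# Hypothetical profiles: separate annealing and extension, then combined.
three_step = [Step("denature", 95, 30), Step("anneal", 55, 30), Step("extend", 72, 60)]
two_step = [Step("denature", 95, 30), Step("anneal_extend", 65, 60)]

print(out_of_range_steps(three_step), out_of_range_steps(two_step))
```

A check of this kind only confirms that a profile stays inside the broad windows given here; the best temperatures for a particular primer pair and enzyme still have to be found empirically.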

The heating and cooling steps involved in one cycle 
of strand separation, hybridization, and extension 
product synthesis can be repeated as many times as
needed to produce the desired quantity of the specific
nucleic acid sequence. The only limitation is the
amount of the primers, thermostable enzyme, and
nucleoside triphosphates present. Usually, from 15 to
30 cycles are completed. For diagnostic detection of
amplified DNA, the number of cycles will depend on the
nature of the sample, the initial target concentration
in the sample and the sensitivity of the detection
process used after amplification. For a given
sensitivity of detection, fewer cycles will be required
if the sample being amplified is pure and the initial
target concentration is high. If the sample is a
complex mixture of nucleic acids and the initial target
concentration is low, more cycles will be required to
amplify the signal sufficiently for detection. For
general amplification and detection, the process is
repeated about 15 times. When amplification is used to
generate sequences to be detected with labeled
sequence-specific probes and when human genomic DNA is
the target of amplification, the process is repeated 15
to 30 times to amplify the sequence sufficiently so
that a clearly detectable signal is produced, i.e., so
that background noise does not interfere with detection.
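
How the required number of cycles depends on the starting target concentration can be estimated with the usual idealized model of exponential amplification, in which each cycle multiplies the target by (1 + E) for a per-cycle efficiency E between 0 and 1. The model, the function names, and the example copy numbers below are illustrative assumptions, not values from the specification; real reactions plateau as primers, enzyme, or nucleoside triphosphates become limiting.

```python
import math


def copies_after(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of cycles."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float,
                  efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches target_copies."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if target_copies <= initial_copies:
        return 0
    fold = target_copies / initial_copies
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


# Hypothetical detection threshold of 1e9 copies.
print(cycles_needed(1_000, 1e9))  # abundant target, perfect doubling
print(cycles_needed(10, 1e9))     # scarce target, perfect doubling
```

With these assumed numbers a hundredfold lower starting concentration costs about seven additional cycles, which is consistent with the statement that scarce targets in complex samples need more cycles than pure, concentrated ones.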

No additional nucleotides, primers, or thermostable 
enzyme need be added after the initial addition, 
provided that no key reagent has been exhausted and 
that the enzyme has not become denatured or
irreversibly inactivated, in which case additional
polymerase or other reagent would have to be added for
the reaction to continue. After the appropriate number
of cycles has been completed to produce the desired
amount of the specific nucleic acid sequence, the
reaction can be halted in the usual manner, e.g., by
inactivating the enzyme by adding EDTA, phenol, SDS, or
CHCl3, or by separating the components of the reaction.

The amplification process can be conducted 
continuously. In one embodiment of an automated
process, the reaction mixture can be temperature cycled
such that the temperature is programmed to be
controlled at a certain level for a certain time. One
such instrument for this purpose is the automated
machine for handling the amplification reaction
developed and marketed by Perkin-Elmer Cetus
Instruments. Detailed instructions for carrying out
PCR with the instrument are available upon purchase of
the instrument. 

The thermostable DNA polymerases of the present
invention with altered 5' to 3' exonuclease activity
are very useful in the diverse processes in which
amplification of a nucleic acid sequence by PCR is
useful. The amplification method may be utilized to
clone a particular nucleic acid sequence for insertion
into a suitable expression vector, as described in U.S.
Patent No. 4,800,159. The vector may be used to
transform an appropriate host organism to produce the
gene product of the sequence by standard methods of
recombinant DNA technology. Such cloning may involve
direct ligation into a vector using blunt-end ligation,
or use of restriction enzymes to cleave at sites
contained within the primers. Other processes suitable
for the thermostable DNA polymerases of the present
invention include those described in U.S. Patent Nos.
4,683,195 and 4,683,202 and European Patent Publication
Nos. 229,701; 237,362; and 258,017; these patents and
publications are incorporated herein by reference. In
addition, the present enzyme is useful in asymmetric
PCR (see Gyllensten and Erlich, 1988, Proc. Natl. Acad.
Sci. USA 85:7652-7656, incorporated herein by
reference); inverse PCR (Ochman et al., 1988, Genetics
120:621, incorporated herein by reference); and for DNA
sequencing (see Innis et al., 1988, Proc. Natl. Acad.
Sci. USA 85:9436-9440, and McConlogue et al., 1988,
Nuc. Acids Res. 16(20):9869), rapid amplification of
cDNA ends (RACE), random priming PCR which is used to
amplify a series of DNA fragments, and PCR processes
with single sided specificity such as anchor PCR and
ligation-mediated anchor PCR as described by Loh, E. in
METHODS: A Companion to Methods in Enzymology (1991) 2:
pp. 11-19.

An additional process in which a 5' to 3'
exonuclease deficient thermostable DNA polymerase would
be useful is a process referred to as polymerase ligase
chain reaction (PLCR). As its name suggests, this
process combines features of PCR with features of
ligase chain reaction (LCR).

PLCR was developed in part as a technique to
increase the specificity of allele-specific PCR in
which the low concentrations of dNTPs utilized (~1 µM)
limited the extent of amplification. In PLCR, DNA is
denatured and four complementary, but not adjacent,
oligonucleotide primers are added with dNTPs, a
thermostable DNA polymerase and a thermostable ligase.
The primers anneal to target DNA in a non-adjacent
fashion and the thermostable DNA polymerase causes the
addition of appropriate dNTPs to the 3' end of the
downstream primer to fill the gap between the
non-adjacent primers and thus render the primers
adjacent. The thermostable ligase will then ligate the
two adjacent oligonucleotide primers.

However, the presence of 5' to 3' exonuclease 
activity in the thermostable DNA polymerase 
significantly decreases the probability of closing the 
gap between the two primers because such activity 
causes the excision of nucleotides or small
oligonucleotides from the 5' end of the downstream
primer thus preventing ligation of the primers.
Therefore, a thermostable DNA polymerase with
attenuated or eliminated 5' to 3' exonuclease activity
would be particularly useful in PLCR.

Briefly, the thermostable DNA polymerases of the 
present invention which have been mutated to have 
reduced, attenuated or eliminated 5' to 3' exonuclease 
activity are useful for the same procedures and 
techniques as their respective non-mutated polymerases
except for procedures and techniques which require 5'
to 3' exonuclease activity such as the homogeneous
assay technique discussed below. Moreover, the mutated
DNA polymerases of the present invention will
oftentimes result in more efficient performance of the
procedures and techniques due to the reduction or 
elimination of the inherent 5' to 3' exonuclease 
activity. 

Specific thermostable DNA polymerases with
attenuated 5' to 3' exonuclease activity include the
following mutated forms of Taq, Tma, Tsps17, TZ05, Tth
and Taf DNA polymerases. In the table below, and
throughout the specification, deletion mutations are
inclusive of the numbered nucleotides or amino acids
which define the deletion.

DNA
Polymerase  Mutation                                                  Mutant Designation

Taq         G(137) to A in nucleotide SEQ ID NO:1                     pRDA3-2
            Gly(46) to Asp in amino acid SEQ ID NO:2                  ASP46 Taq
            Deletion of nucleotides 4-228 of nucleotide SEQ ID NO:1   pTAQd2-76
            Deletion of amino acids 2-76 of amino acid SEQ ID NO:2    MET-ALA 77 Taq
            Deletion of nucleotides 4-138 of nucleotide SEQ ID NO:1   pTAQd2-46
            Deletion of amino acids 2-46 of amino acid SEQ ID NO:2    MET-PHE 47 Taq
            Deletion of nucleotides 4-462 of nucleotide SEQ ID NO:1   pTAQd2-155
            Deletion of amino acids 2-154 of amino acid SEQ ID NO:2   MET-VAL 155 Taq
            Deletion of nucleotides 4-606 of nucleotide SEQ ID NO:1   pTAQd2-202
            Deletion of amino acids 2-202 of amino acid SEQ ID NO:2   MET-THR 203 Taq
            Deletion of nucleotides 4-867 of nucleotide SEQ ID NO:1   pLSG8
            Deletion of amino acids 2-289 of amino acid SEQ ID NO:2   MET-SER 290

Tma         G(110) to A in nucleotide SEQ ID NO:3
            Gly(37) to Asp in amino acid SEQ ID NO:4
            Deletion of nucleotides 4-131 of nucleotide SEQ ID NO:3   pTMAd2-37
            Deletion of amino acids 2-37 of amino acid SEQ ID NO:4    MET-VAL 38
            Deletion of nucleotides 4-60 of nucleotide SEQ ID NO:3    pTMAd2-20
            Deletion of amino acids 2-20 of amino acid SEQ ID NO:4    MET-ASP 21 Tma
            Deletion of nucleotides 4-219 of nucleotide SEQ ID NO:3   pTMAd2-73
            Deletion of amino acids 2-73 of amino acid SEQ ID NO:4    MET-GLU 74 Tma
            Deletion of nucleotides 1-417 of nucleotide SEQ ID NO:3   pTMA16
            Deletion of amino acids 1-139 of amino acid SEQ ID NO:4   MET 140 Tma
            Deletion of nucleotides 1-849 of nucleotide SEQ ID NO:3   pTMA15
            Deletion of amino acids 1-283 of amino acid SEQ ID NO:4   MET 284 Tma

Tsps17      G(128) to A in nucleotide SEQ ID NO:5
            Gly(43) to Asp in amino acid SEQ ID NO:6                  ASP43 Tsps17
            Deletion of nucleotides 4-129 of nucleotide SEQ ID NO:5   pSPSd2-43
            Deletion of amino acids 2-43 of amino acid SEQ ID NO:6    MET-PHE 44 Tsps17
            Deletion of nucleotides 4-219 of nucleotide SEQ ID NO:5   pSPSd2-73
            Deletion of amino acids 2-73 of amino acid SEQ ID NO:6    MET-ALA 74
            Deletion of nucleotides 4-453 of nucleotide SEQ ID NO:5   pSPSd2-151
            Deletion of amino acids 2-151 of amino acid SEQ ID NO:6   MET-LEU 152 Tsps17
            Deletion of nucleotides 4-597 of nucleotide SEQ ID NO:5   pSPSd2-199
            Deletion of amino acids 2-199 of amino acid SEQ ID NO:6   MET-THR 200 Tsps17
            Deletion of nucleotides 4-861 of nucleotide SEQ ID NO:5   pSPSA288
            Deletion of amino acids 2-287 of amino acid SEQ ID NO:6   MET-ALA 288 Tsps17

TZ05        G(137) to A in nucleotide SEQ ID NO:7
            Gly(46) to Asp in amino acid SEQ ID NO:8                  ASP46 TZ05
            Deletion of nucleotides 4-138 of nucleotide SEQ ID NO:7   pZ05d2-46
            Deletion of amino acids 2-46 of amino acid SEQ ID NO:8    MET-PHE 47 TZ05
            Deletion of nucleotides 4-231 of nucleotide SEQ ID NO:7   pZ05d2-77
            Deletion of amino acids 2-77 of amino acid SEQ ID NO:8    MET-ALA 78 TZ05
            Deletion of nucleotides 4-475 of nucleotide SEQ ID NO:7   pZ05d2-155
            Deletion of amino acids 2-155 of amino acid SEQ ID NO:8   MET-VAL 156 TZ05
            Deletion of nucleotides 4-609 of nucleotide SEQ ID NO:7   pZ05d2-203
            Deletion of amino acids 2-203 of amino acid SEQ ID NO:8   MET-THR 204 TZ05
            Deletion of nucleotides 4-873 of nucleotide SEQ ID NO:7   pZ05A292
            Deletion of amino acids 2-291 of amino acid SEQ ID NO:8   MET-ALA 292 TZ05

Tth         G(137) to A in nucleotide SEQ ID NO:9
            Gly(46) to Asp in amino acid SEQ ID NO:10                 ASP46 Tth
            Deletion of nucleotides 4-138 of nucleotide SEQ ID NO:9   pTTHd2-46
            Deletion of amino acids 2-46 of amino acid SEQ ID NO:10   MET-PHE 47 Tth
            Deletion of nucleotides 4-231 of nucleotide SEQ ID NO:9   pTTHd2-77
            Deletion of amino acids 2-77 of amino acid SEQ ID NO:10   MET-ALA 78
            Deletion of nucleotides 4-465 of nucleotide SEQ ID NO:9   pTTHd2-155
            Deletion of amino acids 2-155 of amino acid SEQ ID NO:10  MET-VAL 156
            Deletion of nucleotides 4-609 of nucleotide SEQ ID NO:9   pTTHd2-203
            Deletion of amino acids 2-203 of amino acid SEQ ID NO:10  MET-THR 204 Tth
            Deletion of nucleotides 4-873 of nucleotide SEQ ID NO:9   pTTHA292
            Deletion of amino acids 2-291 of amino acid SEQ ID NO:10  MET-ALA 292 Tth

Taf         G(110) to A and A(111) to T in nucleotide SEQ ID NO:11
            Gly(37) to Asp in amino acid SEQ ID NO:12                 ASP37 Taf
            Deletion of nucleotides 4-111 of nucleotide SEQ ID NO:11  pTAFd2-37
            Deletion of amino acids 2-37 of amino acid SEQ ID NO:12   MET-LEU 38 Taf
            Deletion of nucleotides 4-279 of nucleotide SEQ ID NO:11  pTAF09
            Deletion of amino acids 2-93 of amino acid SEQ ID NO:12   MET-TYR 94 Taf
            Deletion of nucleotides 4-417 of nucleotide SEQ ID NO:11  pTAF11
            Deletion of amino acids 2-139 of amino acid SEQ ID NO:12  MET-GLU 140 Taf
            Deletion of nucleotides 4-609 of nucleotide SEQ ID NO:11  pTAFd2-203
            Deletion of amino acids 2-203 of amino acid SEQ ID NO:12  MET-THR 204
            Deletion of nucleotides 4-852 of nucleotide SEQ ID NO:11  pTAFI285
            Deletion of amino acids 2-284 of amino acid SEQ ID NO:12  MET-ILE 285 Taf

Thermostable DNA Polymerases With Enhanced
5' to 3' Exonuclease Activity



Another aspect of the present invention involves
the generation of thermostable DNA polymerases which
exhibit enhanced or increased 5' to 3' exonuclease
activity over that of their respective native
polymerases. The thermostable DNA polymerases of the
present invention which have increased or enhanced 5'
to 3' exonuclease activity are particularly useful in
the homogeneous assay system described in PCT
application No. 91/05571 filed August 6, 1991, which is
incorporated herein by reference. Briefly, this system
is a process for the detection of a target nucleic acid
sequence in a sample comprising:

(a) contacting a sample comprising single-stranded
nucleic acids with an oligonucleotide containing a
sequence complementary to a region of the target
nucleic acid and a labeled oligonucleotide containing a
sequence complementary to a second region of the same
target nucleic acid strand, but not including the
nucleic acid sequence defined by the first
oligonucleotide, to create a mixture of duplexes during
hybridization conditions, wherein the duplexes comprise
the target nucleic acid annealed to the first
oligonucleotide and to the labeled oligonucleotide such
that the 3' end of the first oligonucleotide is
adjacent to the 5' end of the labeled oligonucleotide;

(b) maintaining the mixture of step (a) with a
template-dependent nucleic acid polymerase having a 5'
to 3' nuclease activity under conditions sufficient to
permit the 5' to 3' nuclease activity of the polymerase
to cleave the annealed, labeled oligonucleotide and
release labeled fragments; and

(c) detecting and/or measuring the release of
labeled fragments.

This homogeneous assay system is one which
generates signal while the target sequence is
amplified, thus minimizing the post-amplification
handling of the amplified product which is common to
other assay systems. Furthermore, a particularly
preferred use of the thermostable DNA polymerases with
increased 5' to 3' exonuclease activity is in a
homogeneous assay system which utilizes PCR
technology. This particular assay system involves:

(a) providing to a PCR assay containing said
sample, at least one labeled oligonucleotide containing
a sequence complementary to a region of the target
nucleic acid, wherein said labeled oligonucleotide
anneals within the target nucleic acid sequence bounded
by the oligonucleotide primers of step (b);

(b) providing a set of oligonucleotide primers,
wherein a first primer contains a sequence
complementary to a region in one strand of the target
nucleic acid sequence and primes the synthesis of a
complementary DNA strand, and a second primer contains
a sequence complementary to a region in a second strand
of the target nucleic acid sequence and primes the
synthesis of a complementary DNA strand; and wherein
each oligonucleotide primer is selected to anneal to
its complementary template upstream of any labeled
oligonucleotide annealed to the same nucleic acid
strand;

(c) amplifying the target nucleic acid sequence
employing a nucleic acid polymerase having 5' to 3'
nuclease activity as a template-dependent polymerizing
agent under conditions which are permissive for PCR
cycling steps of (i) annealing of primers and labeled
oligonucleotide to a template nucleic acid sequence
contained within the target region, and (ii) extending
the primer, wherein said nucleic acid polymerase
synthesizes a primer extension product while the 5' to
3' nuclease activity of the nucleic acid polymerase
simultaneously releases labeled fragments from the
annealed duplexes comprising labeled oligonucleotide
and its complementary template nucleic acid sequences,
thereby creating detectable labeled fragments; and

(d) detecting and/or measuring the release of
labeled fragments to determine the presence or absence
of target sequence in the sample.

The increased 5' to 3' exonuclease activity of the
thermostable DNA polymerases of the present invention
when used in the homogeneous assay systems causes the
cleavage of mononucleotides or small oligonucleotides
from an oligonucleotide annealed to its larger,
complementary polynucleotide. In order for cleavage to
occur efficiently, an upstream oligonucleotide must
also be annealed to the same larger polynucleotide.

The 3' end of this upstream oligonucleotide
provides the initial binding site for the nucleic acid
polymerase. As soon as the bound polymerase encounters
the 5' end of the downstream oligonucleotide, the
polymerase can cleave mononucleotides or small
oligonucleotides therefrom.

The two oligonucleotides can be designed such that 
they anneal in close proximity on the complementary 
target nucleic acid such that binding of the nucleic 
acid polymerase to the 3' end of the upstream
oligonucleotide automatically puts it in contact with
the 5' end of the downstream oligonucleotide. This
process, because polymerization is not required to
bring the nucleic acid polymerase into position to
accomplish the cleavage, is called "polymerization-
independent cleavage".

Alternatively, if the two oligonucleotides anneal 
to more distantly spaced regions of the template 
nucleic acid target, polymerization must occur before 
the nucleic acid polymerase encounters the 5' end of
the downstream oligonucleotide. As the polymerization
continues, the polymerase progressively cleaves
mononucleotides or small oligonucleotides from the 5'
end of the downstream oligonucleotide. This cleaving
continues until the remainder of the downstream
oligonucleotide has been destabilized to the extent
that it dissociates from the template molecule. This
process is called "polymerization-dependent cleavage".
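
The distinction between the two cleavage modes comes down to the gap between the 3' end of the upstream oligonucleotide and the 5' end of the labeled downstream oligonucleotide. The sketch below is a hypothetical Python helper; the coordinate convention and the `contact_gap` threshold (how many unpaired template bases still count as "close proximity") are assumptions, since the text gives no numerical cutoff.

```python
def gap_length(upstream_3prime_pos: int, downstream_5prime_pos: int) -> int:
    """Number of unpaired template bases between the two annealed oligonucleotides.

    Positions are 1-based coordinates counted along the direction of synthesis:
    upstream_3prime_pos is the last base covered by the upstream oligonucleotide,
    downstream_5prime_pos is the first base covered by the labeled one.
    """
    gap = downstream_5prime_pos - upstream_3prime_pos - 1
    if gap < 0:
        raise ValueError("oligonucleotides overlap on the template")
    return gap


def cleavage_mode(upstream_3prime_pos: int, downstream_5prime_pos: int,
                  contact_gap: int = 0) -> str:
    """Classify the expected cleavage mode from the spacing of the two oligonucleotides.

    contact_gap is an assumed threshold: gaps up to this size are treated as
    close enough for the bound polymerase to reach the downstream 5' end
    without any polymerization.
    """
    gap = gap_length(upstream_3prime_pos, downstream_5prime_pos)
    if gap <= contact_gap:
        return "polymerization-independent"
    return "polymerization-dependent"


print(cleavage_mode(20, 21))  # immediately adjacent oligonucleotides
print(cleavage_mode(20, 60))  # 39 bases must be synthesized first
```

In the second case the polymerase has to extend the upstream oligonucleotide across the gap before it meets the labeled oligonucleotide, which is the polymerization-dependent situation described above.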

The attachment of label to the downstream
oligonucleotide permits the detection of the cleaved
mononucleotides and small oligonucleotides.
Subsequently, any of several strategies may be employed
to distinguish the uncleaved labelled oligonucleotide
from the cleaved fragments thereof. In this manner,
nucleic acid samples which contain sequences
complementary to the upstream and downstream
oligonucleotides can be identified. Stated
differently, a labelled oligonucleotide is added
concomitantly with the primer at the start of PCR, and
the signal generated from hydrolysis of the labelled
nucleotide(s) of the probe provides a means for
detection of the target sequence during its
amplification.

In the homogeneous assay system process, a sample
is provided which is suspected of containing the
particular oligonucleotide sequence of interest, the
"target nucleic acid". The target nucleic acid
contained in the sample may be first reverse
transcribed into cDNA, if necessary, and then
denatured, using any suitable denaturing method,
including physical, chemical, or enzymatic means, which
are known to those of skill in the art. A preferred
physical means for strand separation involves heating
the nucleic acid until it is completely (>99%)
denatured. Typical heat denaturation involves
temperatures ranging from about 80°C to about 105°C,
for times ranging from a few seconds to minutes. As an
alternative to denaturation, the target nucleic acid
may exist in a single-stranded form in the sample, such
as, for example, single-stranded RNA or DNA viruses.

The denatured nucleic acid strands are then
incubated with preselected oligonucleotide primers and
labeled oligonucleotide (also referred to herein as
"probe") under hybridization conditions, conditions
which enable the binding of the primers and probes to
the single nucleic acid strands. As known in the art,
the primers are selected so that their relative
positions along a duplex sequence are such that an
extension product synthesized from one primer, when the
extension product is separated from its template
(complement), serves as a template for the extension of
the other primer to yield a replicate chain of defined 
length. 

Because the complementary strands are longer than 
either the probe or primer, the strands have more
points of contact and thus a greater chance of finding
each other over any given period of time. A high molar
excess of probe, plus the primer, helps tip the balance
toward primer and probe annealing rather than template
reannealing.

The primer must be sufficiently long to prime the 
synthesis of extension products in the presence of the 
agent for polymerization. The exact length and 
composition of the primer will depend on many factors,
including temperature of the annealing reaction, source
and composition of the primer, proximity of the probe
annealing site to the primer annealing site, and ratio
of primer:probe concentration. For example, depending
on the complexity of the target sequence, the
oligonucleotide primer typically contains about 15-30
nucleotides, although a primer may contain more or 
fewer nucleotides. The primers must be sufficiently 
complementary to anneal to their respective strands 
selectively and form stable duplexes. 

The primers used herein are selected to be
"substantially" complementary to the different strands
of each specific sequence to be amplified. The primers
need not reflect the exact sequence of the template,
but must be sufficiently complementary to hybridize
selectively to their respective strands.

Non-complementary bases or longer sequences can be 
interspersed into the primer or located at the ends of 
the primer, provided the primer retains sufficient 
complementarity with a template strand to form a stable
duplex therewith. The non-complementary nucleotide
sequences of the primers may include restriction enzyme
sites.



In the practice of the homogeneous assay system,
the labeled oligonucleotide probe must be first
annealed to a complementary nucleic acid before the
nucleic acid polymerase encounters this duplex region,
thereby permitting the 5' to 3' exonuclease activity to
cleave and release labeled oligonucleotide fragments.

To enhance the likelihood that the labeled
oligonucleotide will have annealed to a complementary
nucleic acid before primer extension polymerization
reaches this duplex region, or before the polymerase
attaches to the upstream oligonucleotide in the
polymerization-independent process, a variety of
techniques may be employed. For the polymerization-
dependent process, one can position the probe so that
the 5'-end of the probe is relatively far from the
3'-end of the primer, thereby giving the probe more
time to anneal before primer extension blocks the probe
binding site. Short primer molecules generally require
lower temperatures to form sufficiently stable hybrid
complexes with the target nucleic acid. Therefore, the
labeled oligonucleotide can be designed to be longer
than the primer so that the labeled oligonucleotide
anneals preferentially to the target at higher
temperatures relative to primer annealing.

One can also use primers and labeled
oligonucleotides having differential thermal
stability. For example, the nucleotide composition of
the labeled oligonucleotide can be chosen to have
greater G/C content and, consequently, greater thermal
stability than the primer. In similar fashion, one can
incorporate modified nucleotides into the probe, which
modified nucleotides contain base analogs that form
more stable base pairs than the bases that are
typically present in naturally occurring nucleic acids.
Modifications of the probe that may facilitate
probe binding prior to primer binding to maximize the
efficiency of the present assay include the
incorporation of positively charged or neutral
phosphodiester linkages in the probe to decrease the
repulsion of the polyanionic backbones of the probe and
target (see Letsinger et al., 1988, J. Amer. Chem. Soc.
110:4470); the incorporation of alkylated or
halogenated bases, such as 5-bromouridine, in the probe
to increase base stacking; the incorporation of
ribonucleotides into the probe to force the
probe:target duplex into an "A" structure, which has
increased base stacking; and the substitution of
2,6-diaminopurine (amino adenosine) for some or all of
the adenosines in the probe. In preparing such
modified probes of the invention, one should recognize
that the rate limiting step of duplex formation is
"nucleation", the formation of a single base pair, and
therefore, altering the biophysical characteristic of a
portion of the probe, for instance, only the 3' or 5'
terminal portion, can suffice to achieve the desired
result. In addition, because the 3' terminal portion
of the probe (the 3' terminal 8 to 12 nucleotides)
dissociates following exonuclease degradation of the 5'
terminus by the polymerase, modifications of the 3'
terminus can be made without concern about interference
with polymerase/nuclease activity.

The thermocycling parameters can also be varied to
take advantage of the differential thermal stability of
the labeled oligonucleotide and primer. For example,
following the denaturation step in thermocycling, an
intermediate temperature may be introduced which is
permissible for labeled oligonucleotide binding but not
primer binding, and then the temperature is further
reduced to permit primer annealing and extension. One
should note, however, that probe cleavage need only
occur in later cycles of the PCR process for suitable
results. Thus, one could set up the reaction mixture
so that even though primers initially bind
preferentially to probes, primer concentration is
reduced through primer extension so that, in later
cycles, probes bind preferentially to primers.

To favor binding of the labeled oligonucleotide
before the primer, a high molar excess of labeled
oligonucleotide to primer concentration can also be
used. In this embodiment, labeled oligonucleotide
concentrations are typically in the range of about 2 to
20 times higher than the respective primer
concentration, which is generally 0.5 - 5 x 10^-7 M.
Those of skill recognize that oligonucleotide
concentration, length, and base composition are each
important factors that affect the Tm of any particular
oligonucleotide in a reaction mixture. Each of these
factors can be manipulated to create a thermodynamic
bias to favor probe annealing over primer annealing.
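
The influence of length and base composition on Tm can be
illustrated with a rough estimate. The sketch below is not part of
the disclosure: it assumes the Wallace rule of thumb (about 2
degrees per A/T and 4 degrees per G/C, suitable only for short
oligonucleotides), ignores salt and oligonucleotide concentration,
and uses two hypothetical sequences chosen purely for illustration.
It also computes the 2- to 20-fold probe concentration window for a
given primer concentration.

```python
def wallace_tm(seq):
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def probe_concentration_range(primer_molar):
    """Probe concentrations at 2x and 20x the given primer concentration."""
    return (2 * primer_molar, 20 * primer_molar)


# Hypothetical sequences, not taken from the text.
primer = "ATATTAGCATTAGCATAT"          # shorter, low G/C
probe = "GCGGCCGCAGGCCTGCGGCCGCAGG"    # longer, high G/C

print("primer Tm estimate:", wallace_tm(primer))
print("probe Tm estimate:", wallace_tm(probe))
print("probe range for 1e-7 M primer:", probe_concentration_range(1e-7))
```

A longer, G/C-rich probe gets the higher estimate, which is the
thermodynamic bias the paragraph describes; a nearest-neighbor
model would be needed for quantitative design.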

Of course, the homogeneous assay system can be
applied to systems that do not involve amplification.
In fact, the present invention does not even require
that polymerization occur. One advantage of the
polymerization-independent process lies in the
elimination of the need for amplification of the target
sequence. In the absence of primer extension, the
target nucleic acid is substantially single-stranded.
Provided the primer and labeled oligonucleotide are
adjacently bound to the target nucleic acid, sequential
rounds of oligonucleotide annealing and cleavage of
labeled fragments can occur. Thus, a sufficient amount
of labeled fragments can be generated, making detection
possible in the absence of polymerization. As would be
appreciated by those skilled in the art, the signal
generated during PCR amplification could be augmented
by this polymerization-independent activity.

In addition to the homogeneous assay systems
described above, the thermostable DNA polymerases of
the present invention with enhanced 5' to 3'
exonuclease activity are also useful in other
amplification systems, such as the transcription
amplification system, in which one of the PCR primers
encodes a promoter that is used to make RNA copies of
the target sequence. In similar fashion, the present
invention can be used in a self-sustained sequence
replication (3SR) system, in which a variety of enzymes
are used to make RNA transcripts that are then used to
make DNA copies, all at a single temperature. By
incorporating a polymerase with 5' to 3' exonuclease
activity into a ligase chain reaction (LCR) system,
together with appropriate oligonucleotides, one can
also employ the present invention to detect LCR
products.

Also, just as 5' to 3' exonuclease deficient
thermostable DNA polymerases are useful in PLCR, other
thermostable DNA polymerases which have 5' to 3'
exonuclease activity are also useful in PLCR under
different circumstances. Such is the case when the 5'
tail of the downstream primer in PLCR is
non-complementary to the target DNA. Such
non-complementarity causes a forked structure where the
5' end of the upstream primer would normally anneal to
the target DNA. Thermostable ligases cannot act on such
forked structures. However, the presence of 5' to 3'
exonuclease activity in the thermostable DNA polymerase
will cause the excision of the forked 5' tail of the
upstream primer, thus permitting the ligase to act.
The same processes and techniques which are

DNA polymerases with attenuated 5' to 3' exonuclease
activity are also effective for preparing the
thermostable DNA polymerases with enhanced 5' to 3'
exonuclease activity. As described above, these
processes include such techniques as site-directed
mutagenesis, deletion mutagenesis and "domain
shuffling".

Of particular usefulness in preparing the
thermostable DNA polymerases with enhanced 5' to 3'
exonuclease activity is the "domain shuffling"
technique described above. To briefly summarize, this
technique involves the cleavage of a specific domain of
a polymerase which is recognized as coding for a very
active 5' to 3' exonuclease activity of that
polymerase, and then transferring that domain into the
appropriate area of a second thermostable DNA
polymerase gene which encodes a lower level or no 5' to
3' exonuclease activity. The desired domain may
replace a domain which encodes an undesired property of
the second thermostable DNA polymerase or be added to
the nucleotide sequence of the second thermostable DNA
polymerase.

A particular "domain shuffling" example is set
forth above in which the Tma DNA polymerase coding
sequence comprising codons about 291 through 484 is
substituted for the Taq DNA polymerase I codons 289
through 422. This substitution yields a novel
thermostable DNA polymerase containing the 5' to 3'
exonuclease domain of Taq DNA polymerase (codons
1-289), the 3' to 5' exonuclease domain of Tma DNA
polymerase (codons 291-484) and the DNA polymerase
domain of Taq DNA polymerase (codons 423-832).
However, those skilled in the art will recognize that
other substitutions can be made in order to construct a
thermostable DNA polymerase with certain desired
characteristics such as enhanced 5' to 3' exonuclease
activity.
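
The codon ranges quoted for this chimera can be tallied as a quick
bookkeeping check. The sketch below is illustrative only: it treats
each range as inclusive and simply adds the three segment lengths;
the segment labels are descriptive names introduced here, and no
sequence data is involved.

```python
# (label, first codon, last codon), inclusive, as quoted in the text.
SEGMENTS = [
    ("Taq 5' to 3' exonuclease domain", 1, 289),
    ("Tma 3' to 5' exonuclease domain", 291, 484),
    ("Taq DNA polymerase domain", 423, 832),
]


def segment_length(first, last):
    """Number of codons in an inclusive codon range."""
    return last - first + 1


def chimera_length(segments):
    """Total codons contributed by all segments of the chimera."""
    return sum(segment_length(first, last) for _, first, last in segments)


for label, first, last in SEGMENTS:
    print(f"{label}: {segment_length(first, last)} codons")
print("chimera total:", chimera_length(SEGMENTS), "codons")
```

Under this reading the three segments contribute 289, 194 and 410
codons, for a chimera of 893 codons.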

The following examples are offered by way of
illustration only and are by no means intended to limit
the scope of the claimed invention. In these examples,
all percentages are by weight if for solids and by
volume if for liquids, unless otherwise noted, and all
temperatures are given in degrees Celsius.

Example 1

Preparation of a 5' to 3' Exonuclease Mutant
of Taq DNA Polymerase by Random Mutagenesis
PCR of the Known 5' to 3' Exonuclease Domain

Preparation of Insert

Plasmid pLSG12 was used as a template for PCR.
This plasmid is a HindIII minus version of pLSG5 in
which the Taq polymerase gene nucleotides 616 - 621 of
SEQ ID NO:1 were changed from AAGCTT to AAGCTG. This
change eliminated the HindIII recognition sequence
within the Taq polymerase gene without altering encoded
protein sequence.
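
That the AAGCTT to AAGCTG change is silent can be checked from the
standard genetic code. The sketch below is a minimal illustration
and makes one assumption that is not stated in this passage:
nucleotide 1 of SEQ ID NO:1 is taken to be the A of the ATG start
codon, so that nucleotides 616-621 fall exactly on codons 206 and
207. Only the three codons needed are included in the table.

```python
# Subset of the standard genetic code; only the codons needed here.
CODON_TABLE = {"AAG": "Lys", "CTT": "Leu", "CTG": "Leu"}

HINDIII_SITE = "AAGCTT"


def translate(dna):
    """Translate an in-frame DNA string using the small table above."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def codon_number(nucleotide_position):
    """Codon containing a nucleotide, assuming position 1 is the A of ATG."""
    return (nucleotide_position - 1) // 3 + 1


original = "AAGCTT"   # nucleotides 616-621 before the change
mutated = "AAGCTG"    # nucleotides 616-621 after the change

print("codons:", codon_number(616), "and", codon_number(621))
print("original:", translate(original))
print("mutated: ", translate(mutated))
print("HindIII site still present:", HINDIII_SITE in mutated)
```

Both versions translate to Lys-Leu, while the mutated hexamer no
longer matches the HindIII recognition sequence.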

Using oligonucleotides MK61 (AGGACTACAACTGCCACACACC)
(SEQ ID NO: 21) and RA01
(CGAGGCGCGCCAGCCCCAGGAGATCTACCAGCTCCTTG) (SEQ ID NO: 22)
as primers and pLSG12 as the template, PCR was conducted
to amplify a 384 bp fragment containing the ATG start of
the Taq polymerase gene, as well as an additional 331 bp
of coding sequence downstream of the ATG start codon.

A 100 µl PCR was conducted for 25 cycles utilizing
the following amounts of the following agents and
reactants:

50 pmol of primer MK61 (SEQ ID NO: 21);
50 pmol of primer RA01 (SEQ ID NO: 22);
50 µM of each dNTP;
10 mM Tris-HCl, pH 8.3;
50 mM KCl;
1.5 mM MgCl2;
75.6 pg pLSG12;
2.5 units AmpliTaq DNA polymerase.



The PCR reaction mixture described was placed in a
Perkin-Elmer Cetus Thermocycler and run through the
following profile. The reaction mixture was first
ramped up to 98°C over 1 minute and 45 seconds, and
held at 98°C for 25 seconds. The reaction mixture was
then ramped down to 55°C over 45 seconds and held at
that temperature for 20 seconds. Finally, the mixture
was ramped up to 72°C over 45 seconds, and held at 72°C
for 30 seconds. A final 5 minute extension occurred at
72°C.
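
The run time implied by this profile can be estimated by summing
the ramp and hold times. The sketch below is a rough illustration
and assumes that every one of the 25 cycles repeats all six steps,
including the 1 minute 45 second ramp to 98°C; the text does not
say whether later cycles use a shorter ramp.

```python
# (step, seconds) for one cycle, as described in the profile.
CYCLE_STEPS = [
    ("ramp to 98 C", 105),   # 1 minute 45 seconds
    ("hold at 98 C", 25),
    ("ramp to 55 C", 45),
    ("hold at 55 C", 20),
    ("ramp to 72 C", 45),
    ("hold at 72 C", 30),
]
CYCLES = 25
FINAL_EXTENSION_S = 5 * 60   # final 5 minute extension at 72 C

cycle_seconds = sum(seconds for _, seconds in CYCLE_STEPS)
total_seconds = CYCLES * cycle_seconds + FINAL_EXTENSION_S

print("seconds per cycle:", cycle_seconds)
print("total run time (minutes):", total_seconds / 60)
```

Under that assumption each cycle takes 270 seconds and the whole
run takes about 117.5 minutes.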

The PCR product was then extracted with chloroform
and precipitated with isopropanol using techniques
which are well known in the art.

A 300 ng sample of the PCR product was digested
with 20 U of HindIII (in 30 µl reaction) for 2 hours at
37°C. Then, an additional digestion was made with 8 U
of BssHII for 2 hours at 50°C. This series of
digestions yielded a 330 bp fragment for cloning.

A vector was prepared by digesting 5.3 µg of pLSG12
with 20 U HindIII (in 40 µl) for 2 hours at 37°C. This
digestion was followed by addition of 12 U of BssHII
and incubation for 2 hours at 50°C.

The vector was dephosphorylated by treatment with
CIAP (calf intestinal alkaline phosphatase),
specifically 0.04 U CIAP for 30 minutes at 30°C. Then,
4 µl of 500 mM EGTA was added to the vector preparation
to stop the reaction, and the phosphatase was
inactivated by incubation at 65°C for 45 minutes.

225 ng of the phosphatased vector described above
was ligated at a 1:1 molar ratio with 10 ng of the
PCR-derived insert.

Then, DG116 cells were transformed with one fifth
of the ligation mixture, and ampicillin-resistant
transformants were selected at 30°C.

Appropriate colonies were grown overnight at 30°C
to OD600 0.7. Cells containing the PL vectors were
induced at 37°C in a shaking water bath for 4, 9, or 20
hours, and the preparations were sonicated and heat
treated at 75°C in the presence of 0.2 M ammonium
sulfate. Finally, the extracts were assayed for
polymerase activity and 5' to 3' exonuclease activity.

The 5' to 3' exonuclease activity was quantified
utilizing the 5' to 3' exonuclease assay described
above. Specifically, the synthetic 3' phosphorylated
oligonucleotide probe (phosphorylated to preclude
polymerase extension) BW33
(GATCGCTGCGCGTAACCACCACACCCGCCGCGCp) (SEQ ID NO: 13)
(100 pmol) was 32P-labeled at the 5' end with
gamma-[32P]ATP (3000 Ci/mmol) and T4 polynucleotide
kinase. The reaction mixture was extracted with
phenol:chloroform:isoamyl alcohol, followed by ethanol
precipitation. The 32P-labeled oligonucleotide probe
was redissolved in 100 µl of TE buffer, and
unincorporated ATP was removed by gel filtration
chromatography on a Sephadex G-50 spin column. Five
pmol of 32P-labeled BW33 probe was annealed to 5 pmol
of single-strand M13mp10w DNA, in the presence of 5
pmol of the synthetic oligonucleotide primer BW37
(GCGCTAGGGCGCTGGCAAGTGTAGCGGTCA) (SEQ ID NO: 14) in a
100 µl reaction containing 10 mM Tris-HCl (pH 8.3), 50
mM KCl, and 3 mM MgCl2. The annealing mixture was
heated to 95°C for 5 minutes, cooled to 70°C over 10
minutes, incubated at 70°C for an additional 10
minutes, and then cooled to 25°C over a 30 minute
period in a Perkin-Elmer Cetus DNA thermal cycler.
Exonuclease reactions containing 10 µl of the annealing
mixture were pre-incubated at 70°C for 1 minute. The
thermostable DNA polymerase preparations of the
invention (approximately 0.3 U of enzyme activity) were
added in a 2.5 µl volume to the pre-incubation
reaction, and the reaction mixture was incubated at
70°C. Aliquots (5 µl) were removed after 1 minute and
5 minutes, and stopped by the addition of 1 µl of 60 mM
EDTA. The reaction products were analyzed by
homochromatography and exonuclease activity was
quantified following autoradiography.

Chromatography was carried out in a homochromatography
mix containing 2% partially hydrolyzed yeast RNA in 7M
urea on Polygram CEL 300 DEAE cellulose thin layer
chromatography plates. The presence of 5' to 3'
exonuclease activity resulted in the generation of
small 32P-labeled oligomers, which migrated up the TLC
plate, and were easily differentiated on the
autoradiogram from undegraded probe, which remained at
the origin.

The clone 3-2 had an expected level of polymerase
activity but barely detectable 5' to 3' exonuclease
activity. This represented a greater than 1000-fold
reduction in 5' to 3' exonuclease activity from that
present in native Taq DNA polymerase.

This clone was then sequenced and it was found that
G (137) was mutated to an A in the DNA sequence. This
mutation results in a Gly (46) to Asp mutation in the
amino acid sequence of the Taq DNA polymerase, thus
yielding a thermostable DNA polymerase of the present
invention with significantly attenuated 5' to 3'
exonuclease activity.
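
The correspondence between nucleotide 137 and residue 46 follows
from simple codon arithmetic. The sketch below is illustrative and
assumes that nucleotide 1 is the A of the ATG start codon; the
affected codon itself is not quoted in the text, so the code only
enumerates which glycine codons of the standard genetic code would
become aspartate codons under a G to A change at that position.

```python
# Glycine and aspartate codons of the standard genetic code.
GLY_CODONS = {"GGA", "GGC", "GGG", "GGT"}
ASP_CODONS = {"GAC", "GAT"}


def locate(nucleotide_position):
    """Return (codon number, position within codon), both 1-based.

    Assumes nucleotide 1 is the A of the ATG start codon.
    """
    index = nucleotide_position - 1
    return index // 3 + 1, index % 3 + 1


def g_to_a(codon, position_in_codon):
    """Replace the G at the given 1-based codon position with an A."""
    i = position_in_codon - 1
    if codon[i] != "G":
        raise ValueError("no G at that position")
    return codon[:i] + "A" + codon[i + 1:]


codon_no, pos = locate(137)
compatible = sorted(c for c in GLY_CODONS if g_to_a(c, pos) in ASP_CODONS)

print("nucleotide 137 is position", pos, "of codon", codon_no)
print("Gly codons that give Asp after G to A:", compatible)
```

Nucleotide 137 is the second base of codon 46, and only GGC and GGT
yield an aspartate codon (GAC or GAT) when that base changes from G
to A.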
The recovered protein was purified according to the
Taq DNA polymerase protocol which is taught in Serial
No. 523,394 filed May 15, 1990, incorporated herein by
reference.

Example 2

Construction of Met 289 (Δ289) 544
Amino Acid Form of Taq Polymerase

As indicated in Example 9 of U.S. Serial No.
523,394, filed May 15, 1990, during a purification of
native Taq polymerase an altered form of Taq polymerase
was obtained that catalyzed the template dependent
incorporation of dNTP at 70°C. This altered form of
Taq polymerase was immunologically related to the
approximate 90 kd form of purified native Taq
polymerase but was of lower molecular weight. Based on
mobility, relative to BSA and ovalbumin following
SDS-PAGE electrophoresis, the apparent molecular weight
of this form is approximately 61 kd. This altered form
of the enzyme is not present in carefully prepared
crude extracts of Thermus aquaticus cells as determined
by SDS-PAGE Western blot analysis or in situ DNA
polymerase activity determination (Spanos, A., and
Hubscher, U. (1983) Meth. Enz. 91:263-277) following
SDS-PAGE gel electrophoresis. This form appears to be
a proteolytic artifact that may arise during sample
handling. This lower molecular weight form was
purified to homogeneity and subjected to N-terminal
sequence determination on an ABI automated gas phase
sequencer. Comparison of the obtained N-terminal
sequence with the predicted amino acid sequence of the
Taq polymerase gene (SEQ ID NO:1) indicates this
shorter form arose as a result of proteolytic cleavage
between Glu(289) and Ser(290).
To obtain a further truncated form of a Taq
polymerase gene that would direct the synthesis of a
544 amino acid primary translation product, plasmids
pFC54.t, pSYC1578 and the complementary synthetic
oligonucleotides DG29 (5'-AGCTTATGTCTCCAAAAGCT) (SEQ ID
NO: 23) and DG30 (5'-AGCTTTTGGAGACATA) (SEQ ID NO: 24)
were used. Plasmid pFC54.t was digested to completion
with HindIII and BamHI. Plasmid pSYC1578 was digested
with BstXI (at nucleotides 872 to 883 of SEQ ID NO:1)
and treated with E. coli DNA polymerase I Klenow
fragment in the presence of all 4 dNTPs to remove the 4
nucleotide 3' cohesive end and generate a
CTG-terminated duplex blunt end encoding Leu294 in the
Taq polymerase sequence (see Taq polymerase SEQ ID NO:1
nucleotides 880-882). The DNA sample was digested to
completion with BglII and the approximate 1.6 kb BstXI
(repaired)/BglII Taq DNA fragment was purified by
agarose gel electrophoresis and electroelution. The
pFC54.t plasmid digest (0.1 pmole) was ligated with the
Taq polymerase gene fragment (0.3 pmole) and annealed
nonphosphorylated DG29/DG30 duplex adaptor (0.5 pmole)
under sticky ligase conditions at 30 µg/ml, 15°C
overnight. The DNA was diluted to approximately 10
microgram per ml and ligation continued under blunt end
conditions. The ligated DNA sample was digested with
XbaI to linearize (inactivate) any IL-2 mutein-encoding
ligation products. 80 nanograms of the ligated and
digested DNA was used to transform E. coli K12 strain
DG116 to ampicillin resistance. AmpR candidates were
screened for the presence of an approximate 7.17 kb
plasmid which yielded the expected digestion products
with EcoRI (4,781 bp + 2,386 bp), PstI (4,138 bp +
3,029 bp), ApaI (7,167 bp) and HindIII/PstI (3,400 bp +
3,029 bp + 738 bp). E. coli colonies harboring
candidate plasmids were screened by single colony
immunoblot for the temperature-inducible synthesis of
an approximate 61 kd Taq polymerase related
polypeptide. In addition, candidate plasmids were
subjected to DNA sequence determination at the 5' λPL
promoter:Taq DNA junction and the 3' Taq DNA:BT cry PRE
junction. One of the plasmids encoding the intended
DNA sequence and directing the synthesis of a
temperature-inducible 61 kd Taq polymerase related
polypeptide was designated pLSG8.
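
The diagnostic digests quoted above can be cross-checked against
the plasmid size, since the fragments from any complete digest of a
circular plasmid must add up to its full length. The sketch below
is a simple illustration of that check using the fragment sizes
given in the text; the function name is introduced here for
illustration only.

```python
PLASMID_BP = 7167

# Expected fragment sizes (bp) for each diagnostic digest, from the text.
DIGESTS = {
    "EcoRI": [4781, 2386],
    "PstI": [4138, 3029],
    "ApaI": [7167],
    "HindIII/PstI": [3400, 3029, 738],
}


def inconsistent_digests(digests, plasmid_bp):
    """Return the digests whose fragments do not sum to the plasmid size."""
    return [name for name, fragments in digests.items()
            if sum(fragments) != plasmid_bp]


for name, fragments in DIGESTS.items():
    print(f"{name}: {sum(fragments)} bp in {len(fragments)} fragment(s)")
print("inconsistent digests:", inconsistent_digests(DIGESTS, PLASMID_BP))
```

All four digests sum to 7,167 bp, consistent with the approximate
7.17 kb plasmid size.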

Expression of 61 kDa Taq Pol I. Cultures
containing pLSG8 were grown as taught in Serial No.
523,394 and described in Example 3 below. The 61 kDa
Taq Pol I appears not to be degraded upon
heat-induction at 41°C. After 21 hours at 41°C, a
heat-treated crude extract from a culture harboring
pLSG8 had 12,310 units of heat-stable DNA polymerase
activity per mg crude extract protein, a 24-fold
increase over an uninduced culture. A heat-treated
extract from a 21 hour 37°C-induced pLSG8 culture had
9,503 units of activity per mg crude extract protein.
A nine-fold increase in accumulated levels of Taq Pol I
was observed between a 5 hour and 21 hour induction at
37°C and a nearly four-fold increase between a 5 hour
and 21 hour induction at 41°C. The same total protein
and heat-treated extracts were analyzed by SDS-PAGE.
20 µg crude extract protein or heat-treated crude
extract from 20 µg crude extract protein were applied
to each lane of the gel. The major bands readily
apparent in both the 37°C and 41°C, 21 hour-induced
total protein lanes are equally intense as their
heat-treated counterparts. Heat-treated crude extracts
from 20 µg of total protein from 37°C and 41°C, 21 hour
samples contain 186 units and 243 units of thermostable
DNA polymerase activity, respectively. To determine
the usefulness of 61 kDa Taq DNA polymerase in PCR, PCR
assays were performed using heat-treated crude extracts
from induced cultures of pLSG8. Heat-treated crude
extracts from induced cultures of pLSG5 were used as the
source of full-length Taq Pol I in PCR. PCR product
was observed in reactions utilizing 4 units and 2 units
of truncated enzyme. There was more product in those
PCRs than in any of the full-length enzyme reactions.
In addition, no non-specific higher molecular weight
products were visible.

Purification of 61 kDa Taq Pol I. Purification of
61 kDa Taq Pol I from induced pLSG8/DG116 cells
proceeded as the purification of full-length Taq Pol I
as in Example 12 of U.S. Serial No. 523,394, filed
May 15, 1990 with some modifications.

Induced pLSG8/DG116 cells (15.6 g) were homogenized
and lysed as described in U.S. Serial No. 523,394,
filed May 15, 1990 and in Example 3 below. Fraction I
contained 1.87 g protein and 1.047 x 10^ units of
activity. Fraction II, obtained as a 0.2 M ammonium
sulfate supernatant, contained 1.84 g protein and 1.28 x
10^ units of activity in 74 ml.

Following heat treatment, Polymin P (pH 7.5) was
added slowly to 0.7%. Following centrifugation, the
supernatant, Fraction III, contained 155 mg protein and
1.48 x 10^ units of activity.

Fraction III was loaded onto a 1.15 x 3.1 cm (3.2
ml) phenyl sepharose column at 10 ml/cm²/hour. All of
the applied activity was retained on the column. The
column was washed with 15 ml of the equilibration
buffer and then 5 ml (1.5 column volumes) of 0.1 M KCl
in TE. The polymerase activity was eluted with 2 M
urea in TE containing 20% ethylene glycol. Fractions
(0.5 ml each) with polymerase activity were pooled (8.5
ml) and dialyzed into heparin sepharose buffer
containing 0.1 M KCl. The dialyzed material, Fraction
IV (12.5 ml), contained 5.63 mg of protein and 1.29 x
10^ units of activity.
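
The column figures quoted for the phenyl sepharose step are
mutually consistent, which can be seen from the geometry of a
cylindrical bed. The sketch below is illustrative and assumes that
1.15 cm is the inner diameter and 3.1 cm the bed height, and that
the loading rate is 10 ml per cm² of cross-section per hour; the
function names are introduced here for illustration.

```python
import math


def cross_section_cm2(diameter_cm):
    """Cross-sectional area of a cylindrical column."""
    return math.pi * (diameter_cm / 2) ** 2


def bed_volume_ml(diameter_cm, height_cm):
    """Bed volume of a cylindrical column (1 cm^3 = 1 ml)."""
    return cross_section_cm2(diameter_cm) * height_cm


DIAMETER_CM, HEIGHT_CM = 1.15, 3.1   # assumed: diameter x bed height

volume = bed_volume_ml(DIAMETER_CM, HEIGHT_CM)
flow_ml_per_hour = 10 * cross_section_cm2(DIAMETER_CM)
wash_column_volumes = 5 / volume     # the 5 ml wash with 0.1 M KCl in TE

print("bed volume (ml):", round(volume, 2))
print("loading rate (ml/hour):", round(flow_ml_per_hour, 1))
print("5 ml wash in column volumes:", round(wash_column_volumes, 2))
```

The computed bed volume rounds to the stated 3.2 ml, and the 5 ml
wash corresponds to roughly 1.5 column volumes, as stated.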
Fraction IV was loaded onto a 1.0 ml bed volume
heparin sepharose column equilibrated as above. The
column was washed with 6 ml of the same buffer (A280 to
baseline) and eluted with a 15 ml linear 0.1-0.5 M KCl
gradient in the same buffer. Fractions (0.15 ml)
eluting between 0.16 and 0.27 M KCl were analyzed by
SDS-PAGE. A minor (<1%) contaminating approximately 47
kDa protein copurified with 61 kDa Taq Pol I.
Fractions eluting between 0.165 and 0.255 M KCl were
pooled (2.5 ml) and diafiltered on a Centricon 30
membrane into 2.5X storage buffer. Fraction V
contained 2.8 mg of protein and 1.033 x 10^ units of 61
kDa Taq Pol I.

PCR Using Purified 61 kDa Taq Pol I. PCR reactions
(50 µl) containing 0.5 ng lambda DNA, 10 pmol each of
two lambda-specific primers, 200 µM each dNTP, 10 mM
Tris-Cl, pH 8.3, 3 mM MgCl2, 10 mM KCl and 3.5 units of
61 kDa Taq Pol I were performed. As a comparison, PCR
reactions were performed with 1.25 units of full-length
Taq Pol I, as above, with the substitution of 2 mM
MgCl2 and 50 mM KCl. Thermocycling conditions were 1
minute at 95°C and 1 minute at 60°C for 23 cycles, with
a final 5 minute extension at 75°C. The amount of DNA
per reaction was quantitated by the Hoechst fluorescent
dye assay. 1.11 µg of product was obtained with 61 kDa
Taq Pol I (2.2 x 10^-fold amplification), as compared
with 0.70 µg of DNA with full-length Taq Pol I (1.4 x
10^-fold amplification).

Thermostability of 61 kDa Taq Pol I. Steady state
thermal inactivation of recombinant 94 kDa Taq Pol I
and 61 kDa Taq Pol I was performed at 97.5°C under
buffer conditions mimicking PCR. 94 kDa Taq Pol I has
an apparent half-life of approximately 9 minutes at
97.5°C, whereas the half-life of 61 kDa Taq Pol I was
approximately 21 minutes. The thermal inactivation of
61 kDa Taq Pol I was unaffected by KCl concentration
over a range from 0 to 50 mM.
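
The practical meaning of the two half-lives can be illustrated by
estimating how much activity survives a given time at 97.5°C. The
sketch below assumes simple first-order (exponential) inactivation,
which the text does not state explicitly, and uses a hypothetical
30 minute exposure as the example.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction of activity left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_min)


HALF_LIFE_94_KDA = 9.0    # minutes at 97.5 C, from the text
HALF_LIFE_61_KDA = 21.0   # minutes at 97.5 C, from the text

exposure = 30.0           # hypothetical cumulative time at 97.5 C
print("94 kDa remaining:", round(fraction_remaining(exposure, HALF_LIFE_94_KDA), 3))
print("61 kDa remaining:", round(fraction_remaining(exposure, HALF_LIFE_61_KDA), 3))
```

Under this model the truncated enzyme retains several-fold more
activity than the full-length enzyme after the same exposure.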

Yet another truncated Taq polymerase gene contained
within the ~2.68 kb HindIII-Asp718 fragment of plasmid
pFC85 can be expressed using, for example,
pPLNRBSATG, by operably linking the amino-terminal
HindIII restriction site encoding the Taq pol gene to
an ATG initiation codon. The product of this fusion
upon expression will yield an ~70,000-72,000 dalton



This specific construction can be made by digesting
plasmid pFC85 with HindIII and treating with Klenow
fragment in the presence of dATP and dGTP. The
resulting fragment is treated further with S1 nuclease
to remove any single-stranded extensions and the
resulting DNA digested with Asp718 and treated with
Klenow fragment in the presence of all four dNTPs. The
recovered fragment can be ligated using T4 DNA ligase
to dephosphorylated plasmid pPLNRBSATG, which had been
digested with SacI and treated with Klenow fragment in
the presence of dGTP to construct an ATG blunt end.
This ligation mixture can then be used to transform E.
coli DG116 and the transformants screened for
production of Taq polymerase. Expression can be
confirmed by Western immunoblot analysis and activity
analysis.



Example 3

Construction, Expression and Purification
of a Truncated 5' to 3' Exonuclease
Deficient Tma Polymerase (MET284)
To express a 5' to 3' exonuclease deficient Tma DNA
polymerase lacking amino acids 1-283 of native Tma DNA
polymerase the following steps were performed.

Plasmid pTma12-1 was digested with BspHI
(nucleotide position 848) and HindIII (nucleotide
position 2629). A 1781 base pair fragment was isolated
by agarose gel purification. To separate the agarose
from the DNA, a gel slice containing the desired
fragment was frozen at -20°C in a Costar spinex filter
unit. After thawing at room temperature, the unit was
spun in a microfuge. The filtrate containing the DNA
was concentrated in a Speed Vac concentrator, and the
DNA was precipitated with ethanol.

The isolated fragment was cloned into plasmid
pTma12-1 digested with NcoI and HindIII. Because NcoI
digestion leaves the same cohesive end sequence as
digestion with BspHI, the 1781 base pair fragment has
the same cohesive ends as the full length fragment
excised from plasmid pTma12-1 by digestion with NcoI
and HindIII. The ligation of the isolated fragment
with the digested plasmid results in a fragment switch
and was used to create a plasmid designated pTma14.
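
The compatibility of the two cohesive ends, and the size of the
isolated fragment, can be checked with a few lines of code. The
sketch below relies on the commonly documented recognition
sequences and cut positions of the two enzymes (NcoI, C^CATGG;
BspHI, T^CATGA), which are not quoted in the text, and the helper
function is a name introduced here for illustration.

```python
def five_prime_overhang(site, cut):
    """Overhang left by a palindromic site cut `cut` bases from the
    5' end of each strand."""
    return site[cut:len(site) - cut]


NCOI = ("CCATGG", 1)     # C^CATGG
BSPHI = ("TCATGA", 1)    # T^CATGA

ncoi_end = five_prime_overhang(*NCOI)
bsphi_end = five_prime_overhang(*BSPHI)

# Fragment between the BspHI (848) and HindIII (2629) positions.
fragment_bp = 2629 - 848

print("NcoI overhang: ", ncoi_end)
print("BspHI overhang:", bsphi_end)
print("compatible ends:", ncoi_end == bsphi_end)
print("fragment length (bp):", fragment_bp)
```

Both enzymes leave the same four-base CATG overhang, so the BspHI
end of the 1781 base pair fragment can be ligated into the NcoI
site of the vector.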

Plasmid pTma15 was similarly constructed by cloning
the same isolated fragment into pTma13. As with
pTma14, pTma15 drives expression of a polymerase that
lacks amino acids 1 through 283 of native Tma DNA
polymerase; translation initiates at the methionine
codon at position 284 of the native coding sequence.

Both the pTma14 and pTma15 expression plasmids
expressed at a high level a biologically active
thermostable DNA polymerase devoid of 5' to 3'
exonuclease activity of molecular weight of about 70
kDa; plasmid pTma15 expressed polymerase at a higher
level than did pTma14. Based on similarities with E.
coli Pol I Klenow fragment, such as conservation of
amino acid sequence motifs in all three domains that
are critical for 3' to 5' exonuclease activity,
distance from the amino terminus to the first domain
critical for exonuclease activity, and length of the
expressed protein, the shortened form (MET284) of Tma
DNA polymerase exhibits 3' to 5' exonuclease or
proof-reading activity but lacks 5' to 3' exonuclease
activity. Initial SDS activity gel assays and solution
assays for 3' to 5' exonuclease activity suggest
attenuation in the level of proof-reading activity of
the polymerase expressed by E. coli host cells
harboring plasmid pTma15.

MET284 Tma DNA polymerase was purified from E. coli
strain DG116 containing plasmid pTma15. The seed flask
for a 10 L fermentation contained tryptone (20 g/l),
yeast extract (10 g/l), NaCl (10 g/l), glucose (10
g/l), ampicillin (50 mg/l), and thiamine (10 mg/l). The
seed flask was inoculated with a colony from an agar
plate (a frozen glycerol culture can be used). The
seed flask was grown at 30°C to between 0.5 and 2.0 O.D.
(A680). The volume of seed culture inoculated into the
fermentor is calculated such that the bacterial
concentration is 0.5 mg dry weight/liter. The 10 liter
growth medium contained 25 mM KH2PO4, 10 mM (NH4)2SO4,
4 mM sodium citrate, 0.4 mM FeCl3, 0.04 mM ZnCl2, 0.03
mM CoCl2, 0.03 mM CuCl2, and 0.03 mM H3BO3. The
following sterile components were added: 4 mM MgSO4,
20 g/l glucose, 20 mg/l thiamine, and 50 mg/l
ampicillin. The pH was adjusted to 6.8 with NaOH and
controlled during the fermentation by added NH4OH.
Glucose was continually added by coupling to NH4OH
addition. Foaming was controlled by the addition of
propylene glycol as necessary, as an antifoaming agent.
Dissolved oxygen concentration was maintained at 40%.

The fermentor was inoculated as described above,
and the culture was grown at 30°C to a cell density of
0.5 to 1.0 x 10^10 cells/ml (optical density [A680] of
15). The growth temperature was shifted to 38°C to
induce the synthesis of MET284 Tma DNA polymerase. The
temperature shift increases the copy number of the
pTma15 plasmid and simultaneously derepresses the
lambda promoter controlling transcription of the
modified Tma DNA polymerase gene through inactivation
of the temperature-sensitive cI repressor encoded by
the defective prophage lysogen in the host.

The cells were grown for 6 hours to an optical
density of 37 (A680) and harvested by centrifugation.
The cell mass (ca. 95 g/l) was resuspended in an
equivalent volume of buffer containing 50 mM Tris-Cl,
pH 7.6, 20 mM EDTA and 20% (w/v) glycerol. The
suspension was slowly dripped into liquid nitrogen to
freeze the suspension as "beads" or small pellets. The
frozen cells were stored at -70°C.

To 200 g of frozen beads (containing 100 g wet
weight cell) were added 100 ml of 1X TE (50 mM Tris-Cl,
pH 7.5, 10 mM EDTA) and DTT to 0.3 mM, PMSF to 2.4 mM,
leupeptin to 1 µg/ml and TLCK (a protease inhibitor) to
0.2 mM. The sample was thawed on ice and uniformly
resuspended in a blender at low speed. The cell
suspension was lysed in an Aminco French pressure cell
at 20,000 psi. To reduce viscosity, the lysed cell
sample was sonicated 4 times for 3 min. each at 50%
duty cycle and 70% output. The sonicate was adjusted to
550 ml with 1X TE containing 1 mM DTT, 2.4 mM PMSF, 1
µg/ml leupeptin and 0.2 mM TLCK (Fraction I). After
addition of ammonium sulfate to 0.3 M, the crude lysate
was rapidly brought to 75°C in a boiling water bath and
transferred to a 75°C water bath for 15 min. to
denature and inactivate E. coli host proteins. The
heat-treated sample was chilled rapidly to 0°C and
incubated on ice for 20 min. Precipitated proteins and
cell membranes were removed by centrifugation at 20,000
X G for 30 min. at 5°C and the supernatant (Fraction
II) saved.

The heat-treated supernatant (Fraction II) was
treated with polyethyleneimine (PEI) to remove most of
the DNA and RNA. Polymin P (34.96 ml of 10% [w/v], pH
7.5) was slowly added to 437 ml of Fraction II at 0°C
while stirring rapidly. After 30 min. at 0°C, the
sample was centrifuged at 20,000 X G for 30 min. The
supernatant (Fraction III) was applied at 80 ml/hr to a
100 ml phenylsepharose column (3.2 x 12.5 cm) that had
been equilibrated in 50 mM Tris-Cl, pH 7.5, 0.3 M
ammonium sulfate, 10 mM EDTA, and 1 mM DTT. The column
was washed with about 200 ml of the same buffer (A280
to baseline) and then with 150 ml of 50 mM Tris-Cl, pH
7.5, 100 mM KCl, 10 mM EDTA and 1 mM DTT. The MET284
Tma DNA polymerase was then eluted from the column with
buffer containing 50 mM Tris-Cl, pH 7.5, 2 M urea, 20%
(w/v) ethylene glycol, 10 mM EDTA, and 1 mM DTT, and
fractions containing DNA polymerase activity were
pooled (Fraction IV).

Fraction IV is adjusted to a conductivity
equivalent to 50 mM KCl in 50 mM Tris-Cl, pH 7.5, 1 mM
EDTA, and 1 mM DTT. The sample was applied (at 9
ml/hr) to a 15 ml heparin-sepharose column that had
been equilibrated in the same buffer. The column was
washed with the same buffer at ca. 14 ml/hr (3.5 column
volumes) and eluted with a 150 ml 0.05 to 0.5 M KCl
gradient in the same buffer. The DNA polymerase
activity eluted between 0.11 and 0.22 M KCl. Fractions
containing the pTma15-encoded modified Tma DNA
polymerase are pooled, concentrated, and diafiltered
against 2.5X storage buffer (50 mM Tris-Cl, pH 8.0, 250
mM KCl, 0.25 mM EDTA, 2.5 mM DTT, and 0.5% Tween 20),
subsequently mixed with 1.5 volumes of sterile 80%
(w/v) glycerol, and stored at -20°C. Optionally, the
heparin sepharose-eluted DNA polymerase or the phenyl
sepharose-eluted DNA polymerase can be dialyzed or
adjusted to a conductivity equivalent to 50 mM KCl in
50 mM Tris-Cl, pH 7.5, 1 mM DTT, 1 mM EDTA, and 0.2%
Tween 20 and applied (1 mg protein/ml resin) to an
affigel blue column that has been equilibrated in the
same buffer. The column is washed with three to five
column volumes of the same buffer and eluted with a 10
column volume KCl gradient (0.05 to 0.8 M) in the same
buffer. Fractions containing DNA polymerase activity
(eluting between 0.25 and 0.4 M KCl) are pooled,
concentrated, diafiltered, and stored as above.
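
The final storage conditions implied by mixing one volume of 2.5X storage buffer with 1.5 volumes of 80% (w/v) glycerol can be checked with a short calculation. The sketch below assumes that the volumes are simply additive, which the text does not state; the function and variable names are illustrative only.

```python
# Final concentrations after mixing 1 volume of 2.5X storage buffer
# with 1.5 volumes of 80% (w/v) glycerol.
# Assumption: volumes are additive. All names are illustrative.

def dilute(stock_conc, stock_vol, total_vol):
    """Concentration of a component after dilution to total_vol."""
    return stock_conc * stock_vol / total_vol

buffer_vol = 1.0      # volumes of 2.5X storage buffer
glycerol_vol = 1.5    # volumes of 80% (w/v) glycerol
total_vol = buffer_vol + glycerol_vol

# Components of the 2.5X storage buffer as given in the text
buffer_2_5x = {
    "Tris-Cl (mM)": 50.0,
    "KCl (mM)": 250.0,
    "EDTA (mM)": 0.25,
    "DTT (mM)": 2.5,
    "Tween 20 (%)": 0.5,
}

final = {name: dilute(conc, buffer_vol, total_vol)
         for name, conc in buffer_2_5x.items()}
final["glycerol (% w/v)"] = dilute(80.0, glycerol_vol, total_vol)

for name, conc in final.items():
    print(f"{name}: {conc:g}")
```

Under these assumptions the enzyme ends up in 1X storage buffer (20 mM Tris-Cl, 100 mM KCl, 0.1 mM EDTA, 1 mM DTT, 0.2% Tween 20) with 48% (w/v) glycerol.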

The relative thermoresistance of various DNA
polymerases has been compared. At 97.5°C the half-life
of native Tma DNA polymerase is more than twice the
half-life of either native or recombinant Taq
(i.e., AmpliTaq) DNA polymerase. Surprisingly, the
half-life at 97.5°C of MET284 Tma DNA polymerase is 2.5
to 3 times longer than the half-life of native Tma DNA
polymerase.

PCR tubes containing 10 mM Tris-Cl, pH 8.3, and 1.5
mM MgCl2 (for Taq or native Tma DNA polymerase) or 3 mM
MgCl2 (for MET284 Tma DNA polymerase), 50 mM KCl (for
Taq, native Tma and MET284 Tma DNA polymerases) or no
KCl (for MET284 Tma DNA polymerase), 0.5 µM each of
primers PCR01 and PCR02, 1 ng of lambda template DNA,
200 µM of each dNTP except dCTP, and 4 units of each
enzyme were incubated at 97.5°C in a large water bath
for times ranging from 0 to 60 min. Samples were
withdrawn with time, stored at 0°C, and 5 µl assayed at
75°C for 10 min. in a standard activity assay for
residual activity.

Taq DNA polymerase had a half-life of about 10 min.
at 97.5°C, while native Tma DNA polymerase had a
half-life of about 21 to 22 min. at 97.5°C.
Surprisingly, the MET284 form of Tma DNA polymerase had
a significantly longer half-life (50 to 55 min.) than
either Taq or native Tma DNA polymerase. The improved
thermoresistance of MET284 Tma DNA polymerase will find
applications in PCR, particularly where G+C-rich
targets are difficult to amplify because the
denaturation of target and PCR product sequences leads
to enzyme inactivation.
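
To make the practical difference between these half-lives concrete, the sketch below estimates the fraction of activity remaining after a given time at 97.5°C. It assumes simple first-order (exponential) inactivation, which the text does not state, and uses midpoints of the reported half-life ranges; the function name is illustrative.

```python
# Residual activity under ASSUMED first-order thermal inactivation:
#   fraction remaining = 0.5 ** (t / t_half)
# Half-lives at 97.5°C are taken from the text (midpoints of ranges).

def residual_fraction(minutes, half_life_min):
    """Fraction of initial activity left after `minutes` of incubation."""
    return 0.5 ** (minutes / half_life_min)

half_lives = {
    "Taq": 10.0,          # about 10 min
    "native Tma": 21.5,   # about 21 to 22 min
    "MET284 Tma": 52.5,   # 50 to 55 min
}

for enzyme, t_half in half_lives.items():
    frac = residual_fraction(30.0, t_half)
    print(f"{enzyme}: {frac:.0%} of activity left after 30 min")
```

The ranking of the three enzymes does not depend on the incubation time chosen; only the size of the gap does.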

PCR tubes containing 50 µl of 10 mM Tris-Cl, pH
8.3, 3 mM MgCl2, 200 µM of each dNTP, 0.5 ng
bacteriophage lambda DNA, 0.5 µM of primer PCR01, 4
units of MET284 Tma DNA polymerase, and 0.5 µM of
primer PCR02 or PL10 were cycled for 25 cycles using
Tden of 96°C for 1 min. and Tanneal-extend of 60°C for
2 min. Lambda DNA template, deoxynucleotide stock
solutions, and primers PCR01 and PCR02 were part of the
PECI GeneAmp kit. Primer PL10 has the sequence:
5'-GGCGTACCTTTGTCTCACGGGCAAC-3' (SEQ ID NO: 25) and is
complementary to bacteriophage lambda nucleotides
8106-8130.

The primers PCR01 and PCR02 amplify a 500 bp
product from lambda. The primer pair PCR01 and PL10
amplify a 1 kb product from lambda. After
amplification with the respective primer sets, 5 µl
aliquots were subjected to agarose gel electrophoresis
and the specific intended product bands visualized with
ethidium bromide staining. Abundant levels of product
were generated with both primer sets, showing that
MET284 Tma DNA polymerase successfully amplified the
intended target sequence.

Example 4

Expression of Truncated Tma DNA Polymerase

To express a 5' to 3' exonuclease deficient form of
Tma DNA polymerase which initiates translation at MET
140, the coding region corresponding to amino acids 1
through 139 was deleted from the expression vector.
The protocol for constructing such a deletion is
similar to the construction described in Examples 2
and 3: a shortened gene fragment is excised and then
reinserted into a vector from which a full length
fragment has been excised. However, the shortened
fragment can be obtained as a PCR amplification product
rather than purified from a restriction digest. This
methodology allows a new upstream restriction site (or
other sequences) to be incorporated where useful.

To delete the region up to the methionine codon at
position 140, an SphI site was introduced into pTma12-1
and pTma13 using PCR. A forward primer corresponding
to nucleotides 409-436 of Tma DNA polymerase SEQ ID
NO: 3 (FL63) was designed to introduce an SphI site just
upstream of the methionine codon at position 140. The
reverse primer corresponding to the complement of
nucleotides 608-634 of Tma DNA polymerase SEQ ID NO: 3
(FL69) was chosen to include an XbaI site at position
621. Plasmid pTma12-1 linearized with SmaI was used as
the PCR template, yielding a PCR product of
approximately 225 bp.
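
The stated product size can be checked from the primer coordinates. This is a minimal sketch, assuming that the amplicon spans the outermost primer coordinates inclusively and that nucleotide 1 of SEQ ID NO: 3 is the first base of codon 1; neither assumption is spelled out in the text.

```python
# Expected amplicon length from primer coordinates on the template.
# Assumptions: inclusive coordinates; nucleotide 1 = first base of codon 1.

def amplicon_length(forward_start, reverse_end):
    """Inclusive length between the outermost primer coordinates."""
    return reverse_end - forward_start + 1

fl63_start = 409   # forward primer FL63 covers nucleotides 409-436
fl69_end = 634     # reverse primer FL69 is the complement of 608-634

length_bp = amplicon_length(fl63_start, fl69_end)
print(length_bp)   # 226, consistent with "approximately 225 bp"

# First nucleotide of codon 140 under the numbering assumption above
met140_first_nt = 3 * (140 - 1) + 1
print(met140_first_nt)  # 418, which lies inside the FL63 region
```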

Before digestion, the PCR product was treated with
50 µg/ml of Proteinase K in PCR reaction mix plus 0.5%
SDS and 5 mM EDTA. After incubating for 30 minutes at
37°C, the Proteinase K was heat inactivated at 68°C for
10 minutes. This procedure eliminated any Taq
polymerase bound to the product that could inhibit
subsequent restriction digests. The buffer was changed
to a TE buffer, and the excess PCR primers were removed
with a Centricon 100 microconcentrator.

The amplified fragment was digested with SphI, then
treated with Klenow to create a blunt end at the
SphI-cleaved end, and finally digested with XbaI. The
resulting fragment was ligated with plasmid pTma13
(pTma12-1 would have been suitable) that had been
digested with NcoI, repaired with Klenow, and then
digested with XbaI. The ligation yielded an in-frame
coding sequence with the region between the NcoI site
(at the first methionine codon of the coding sequence)
and the introduced SphI site (upstream of the
methionine codon at position 140) deleted. The
resulting expression vector was designated pTma16.

The primers used in this example are given below
and in the Sequence Listing section.

Primer   SEQ ID NO:       Sequence

FL63     SEQ ID NO: 26    5'GATAAAGGCATGCTTCAGCTTGTGAACG
FL69     SEQ ID NO: 27    5'TGTACTTCTCTAGAAGCTGAACAGCAG

Example 5

Elimination of Undesired RBS in
MET140 Expression Vectors

Reduced expression of the MET140 form of Tma DNA
polymerase can be achieved by eliminating the ribosome
binding site (RBS) upstream of the methionine codon at
position 140. The RBS was eliminated via
oligonucleotide site-directed mutagenesis without
changing the amino acid sequence. Taking advantage of
the redundancy of the genetic code, one can make
changes in the third position of codons to alter the
nucleic acid sequence, thereby eliminating the RBS,
without changing the amino acid sequence of the encoded
protein.
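
The principle of removing an internal RBS through third-position (synonymous) codon changes can be illustrated as follows. The nine-nucleotide sequences in this sketch are hypothetical and are not taken from the Tma gene; `GGAGG` is used as an assumed stand-in for a Shine-Dalgarno-like motif, and the standard genetic code is assumed.

```python
# Illustration: synonymous third-position changes remove an RBS-like
# motif without changing the encoded peptide.
# The sequences are HYPOTHETICAL examples, not from the Tma gene.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

RBS_MOTIF = "GGAGG"          # assumed Shine-Dalgarno-like core

original = "AAGGAGGTG"       # AAG GAG GTG -> Lys Glu Val
mutated = "AAAGAAGTG"        # AAA GAA GTG -> Lys Glu Val

print(translate(original), RBS_MOTIF in original)
print(translate(mutated), RBS_MOTIF in mutated)
```

Both sequences translate to the same peptide, but only the first contains the motif, which is the effect the mutagenic primer is designed to achieve.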

A mutagenic primer (FL64) containing the modified
sequence was synthesized and phosphorylated.
Single-stranded pTma09 (a full length clone having an
NcoI site) was prepared by coinfecting with the helper
phage R408, commercially available from Stratagene. A
"gapped duplex" of single stranded pTma09 and the large
fragment from the PvuII digestion of pBS13+ was created
by mixing the two plasmids, heating to boiling for 2
minutes, and cooling to 65°C for 5 minutes. The
phosphorylated primer was then annealed with the
"gapped duplex" by mixing, heating to 80°C for 2
minutes, and then cooling slowly to room temperature.
The remaining gaps were filled by extension with Klenow
and the fragments ligated with T4 DNA ligase, both
reactions taking place in 200 µM of each dNTP and 40 µM
ATP in standard salts at 37°C for 30 minutes.

The resulting circular fragment was transformed
into DG101 host cells by plate transformations on
nitrocellulose filters. Duplicate filters were made
and the presence of the correct plasmid was detected by
probing with a γ-32P-phosphorylated probe (FL65). The
vector that resulted was designated pTma19.

The RBS minus portion from pTma19 was cloned into
pTma12-1 via an NcoI/XbaI fragment switch. Plasmid
pTma19 was digested with NcoI and XbaI, and the 620 bp
fragment was purified by gel electrophoresis, as in
Example 3, above. Plasmid pTma12-1 was digested with
NcoI, XbaI, and XcmI. The XcmI cleavage inactivates
the RBS+ fragment for the subsequent ligation step,
which is done under conditions suitable for ligating
"sticky" ends (dilute ligase and 40 µM ATP). Finally,
the ligation product is transformed into DG116 host
cells for expression and designated pTma19-RBS.

The oligonucleotide sequences used in this example
are listed below and in the Sequence Listing section.

Oligo    SEQ ID NO:       Sequence

FL64     SEQ ID NO: 28    5'CTGAAGCATGTCTTTGTCACCGGTTACTATGAATAT
FL65     SEQ ID NO: 29    5'TAGTAACCGGTGACAAAG
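
The relationship between the mutagenic primer and the screening probe can be checked directly from the two sequences listed above. The sketch below computes the reverse complement of FL65 and tests whether it occurs within FL64. That the probe was designed this way is an inference from the sequences, not a statement in the text.

```python
# Check whether probe FL65 is complementary to part of primer FL64.
# Sequences are those listed for SEQ ID NO: 28 and SEQ ID NO: 29.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return seq.translate(COMPLEMENT)[::-1]

FL64 = "CTGAAGCATGTCTTTGTCACCGGTTACTATGAATAT"
FL65 = "TAGTAACCGGTGACAAAG"

probe_rc = reverse_complement(FL65)
position = FL64.find(probe_rc)   # 0-based index, -1 if not found

print(probe_rc)
print(position)
```

The reverse complement of FL65 matches an 18-nucleotide stretch of FL64 beginning at its twelfth nucleotide, which is consistent with a probe that hybridizes across the mutagenized region.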

Example 6

Expression of Truncated Tma DNA Polymerases
MET-ASP21 and MET-GLU74

To effect translation initiation at the aspartic
acid codon at position 21 of the Tma DNA polymerase gene
coding sequence, a methionine codon is introduced before
the codon, and the region from the initial NcoI site to
this introduced methionine codon is deleted. Similar to
Example 4, the deletion process involved PCR with the
same downstream primer described above (FL69) and an
upstream primer (FL66) designed to incorporate an NcoI
site and a methionine codon to yield a 570 base pair
product.

The amplified product was concentrated with a
Centricon-100 microconcentrator to eliminate excess
primers and buffer. The product was concentrated in a
Speed Vac concentrator and then resuspended in the
digestion mix. The amplified product was digested with
NcoI and XbaI. Likewise, pTma12-1, pTma13, or
pTma19-RBS was digested with the same two restriction
enzymes, and the digested, amplified fragment is ligated
with the digested expression vector. The resulting
construct has a deletion from the NcoI site upstream of
the start codon of the native Tma coding sequence to the
new methionine codon introduced upstream of the aspartic
acid codon at position 21 of the native Tma coding
sequence.

Similarly, a deletion mutant was created such that
translation initiation begins at Glu74, the glutamic
acid codon at position 74 of the native Tma coding
sequence. An upstream primer (FL67) is designed to
introduce a methionine codon and an NcoI site before
Glu74. The downstream primer and cloning protocol used
are as described above for the MET-ASP21 construct.

The upstream primer sequences used in this example
are listed below and in the Sequence Listing section.

Oligo    SEQ ID NO:       Sequence

FL66     SEQ ID NO: 30    5'CTATGCCATGGATAGATCGCTTTCTACTTCC
FL67     SEQ ID NO: 31    5'CAAGCCCATGGAAACTTACAAGGCTCAAAGA
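
As a consistency check on the primer designs in Examples 4 and 6, the sketch below looks for the recognition sequence each primer is said to introduce and, for the two NcoI primers, reads the codon that follows the ATG embedded in the site. The recognition sequences are the standard ones for these enzymes; the primer sequences are copied from the tables in this section, and the helper names are illustrative.

```python
# Locate introduced restriction sites in the primers of Examples 4 and 6.
SITES = {"SphI": "GCATGC", "XbaI": "TCTAGA", "NcoI": "CCATGG"}

PRIMERS = {
    "FL63": ("GATAAAGGCATGCTTCAGCTTGTGAACG", "SphI"),
    "FL69": ("TGTACTTCTCTAGAAGCTGAACAGCAG", "XbaI"),
    "FL66": ("CTATGCCATGGATAGATCGCTTTCTACTTCC", "NcoI"),
    "FL67": ("CAAGCCCATGGAAACTTACAAGGCTCAAAGA", "NcoI"),
}

def site_index(primer, enzyme):
    """0-based index of the enzyme's recognition site, or -1 if absent."""
    return primer.find(SITES[enzyme])

def codon_after_ncoi_atg(primer):
    """Codon immediately 3' of the ATG inside CCATGG (CC-ATG-G...)."""
    start = site_index(primer, "NcoI")
    return primer[start + 5:start + 8]

for name, (seq, enzyme) in PRIMERS.items():
    print(name, enzyme, site_index(seq, enzyme))

print(codon_after_ncoi_atg(PRIMERS["FL66"][0]))  # GAT, an Asp codon
print(codon_after_ncoi_atg(PRIMERS["FL67"][0]))  # GAA, a Glu codon
```

The codons found after the introduced ATG (Asp in FL66, Glu in FL67) agree with the intended MET-ASP21 and MET-GLU74 start points.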

Example 7

Expression of Truncated Taf Polymerase

Mutein forms of the Taf polymerase lacking 5' to 3'
exonuclease activity were constructed by introducing
deletions in the 5' end of the Taf polymerase gene.
Both 279 and 417 base pair deletions were created using
the following protocol: an expression plasmid was
digested with restriction enzymes to excise the desired
fragment, the fragment ends were repaired with Klenow
and all four dNTPs to produce blunt ends, and the
products were ligated to produce a new circular plasmid
with the desired deletion. To express a 93 kilodalton,
5' to 3' exonuclease-deficient form of Taf polymerase,
a 279 bp deletion comprising amino acids 2-93 was
generated. To express an 88 kilodalton, 5' to 3'
exonuclease-deficient form of Taf polymerase, a 417 bp
deletion comprising amino acids 2-139 was generated.

To create a plasmid with codons 2-93 deleted,
pTaf03 was digested with NcoI and NdeI and the ends
were repaired by Klenow treatment. The digested and
repaired plasmid was diluted to 5 µg/ml and ligated
under blunt end conditions. The dilute plasmid
concentration favors intramolecular ligations. The
ligated plasmid was transformed into DG116.
Mini-screen DNA preparations were subjected to
restriction analysis and correct plasmids were
confirmed by DNA sequence analysis. The resulting
expression vector created by deleting a segment from
pTaf03 was designated pTaf09. A similar vector created
from pTaf05 was designated pTaf10.

Expression vectors also were created with codons
2-139 deleted. The same protocol was used with the
exception that the initial restriction digestion was
performed with NcoI and BglII. The expression vector
created from pTaf03 was designated pTaf11 and the
expression vector created from pTaf05 was designated
pTaf12.

Example 8

Derivation and Expression of 5' to 3'
Exonuclease-Deficient, Thermostable DNA
Polymerase of Thermus species, Z05
Comprising Amino Acids 292 Through 834

To obtain a DNA fragment encoding a 5' to 3'
exonuclease-deficient thermostable DNA polymerase from
Thermus species Z05, a portion of the DNA polymerase
gene comprising amino acids 292 through 834 is
selectively amplified in a PCR with forward primer
TZA292 and reverse primer TZR01 as follows:

50 pmoles TZA292
50 pmoles TZR01
10 ng Thermus sp. Z05 genomic DNA
2.5 units AmpliTaq DNA polymerase
50 µM each dATP, dGTP, dCTP, dTTP

in an 80 µl solution containing 10 mM Tris-HCl pH 8.3,
50 mM KCl and overlaid with 100 µl of mineral oil. The
reaction was initiated by addition of 20 µl containing
7.5 mM MgCl2 after the tubes had been placed in an 80°C
preheated cycler.
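
The final magnesium concentration after this hot start follows from the volumes given. This is a minimal sketch, assuming that the 20 µl MgCl2 addition simply mixes with the 80 µl reaction and that the mineral oil overlay can be ignored; the function name is illustrative.

```python
# Final MgCl2 concentration after adding 20 µl of 7.5 mM MgCl2
# to an 80 µl reaction. Assumption: volumes are additive.

def final_concentration(stock_mM, added_ul, reaction_ul):
    """Concentration (mM) of the added component in the combined volume."""
    return stock_mM * added_ul / (added_ul + reaction_ul)

mg_final = final_concentration(7.5, 20.0, 80.0)
print(f"{mg_final} mM MgCl2 in {20 + 80} µl")
```

Under that assumption the reaction runs at 1.5 mM MgCl2 in a final volume of 100 µl.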

The genomic DNA was digested to completion with
restriction endonuclease Asp718, denatured at 98°C for
5 minutes and cooled rapidly to 0°C. The sample was
cycled in a Perkin-Elmer Cetus Thermal Cycler according
to the following profile:

STEP CYCLE to 96°C and hold for 20 seconds.
STEP CYCLE to 55°C and hold for 30 seconds.
RAMP to 72°C over 30 seconds and hold for 1 minute.
REPEAT profile for 3 cycles.

STEP CYCLE to 96°C and hold for 20 seconds.
STEP CYCLE to 65°C and hold for 2 minutes.
REPEAT profile for 25 cycles.

After last cycle HOLD for 5 minutes.
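
The two-stage cycling profile can be written down as data, which makes the programmed time easy to total. This is only a sketch: the tuple layout is an invented representation, and the instrument's transition times between STEP CYCLE setpoints are not given in the text and are therefore left out.

```python
# Programmed time for the two-stage profile above.
# Each step: (target temperature in °C, ramp seconds, hold seconds).
# Unspecified instrument transition times are NOT included.

STAGE_1 = {"cycles": 3,
           "steps": [(96, 0, 20), (55, 0, 30), (72, 30, 60)]}
STAGE_2 = {"cycles": 25,
           "steps": [(96, 0, 20), (65, 0, 120)]}
FINAL_HOLD_S = 5 * 60

def stage_seconds(stage):
    """Total ramp plus hold time for all cycles of one stage."""
    per_cycle = sum(ramp + hold for _, ramp, hold in stage["steps"])
    return per_cycle * stage["cycles"]

total_s = stage_seconds(STAGE_1) + stage_seconds(STAGE_2) + FINAL_HOLD_S
print(total_s, "seconds, or about", round(total_s / 60), "minutes")
```

The total is a lower bound on run time, since heating and cooling between setpoints add to it.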

The intended 1.65 kb PCR product is purified by
agarose gel electrophoresis, and recovered following
phenol-chloroform extraction and ethanol precipitation.
The purified product is digested with restriction
endonucleases NdeI and BglII and ligated with
NdeI/BamHI-digested and dephosphorylated plasmid vector
pDG164 (U.S. Serial No. 455,967, filed December 22,
1989, Example 6B, incorporated herein by reference).
Ampicillin-resistant transformants of E. coli strain
DG116 are selected at 30°C and screened for the desired
recombinant plasmid. Plasmid pZ05A292 encodes a 544
amino acid, 5' to 3' exonuclease-deficient Thermus sp.
Z05 thermostable DNA polymerase analogous to the pLSG8
encoded protein of Example 2. The DNA polymerase
activity is purified as in Example 2. The purified
protein is deficient in 5' to 3' exonuclease activity,
is more thermoresistant than the corresponding native
enzyme and is particularly useful in PCR of G+C-rich
templates.
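
The stated protein and product sizes are consistent with the residue range once the initiating methionine supplied by the forward primer is counted. The short check below assumes one added Met codon and one stop codon and ignores the extra primer-tail bases, which is why the coding length comes out slightly below the 1.65 kb product; the function names are illustrative.

```python
# Protein length and minimal coding length for a truncated polymerase
# that starts with an added Met followed by residues first..last.
# Assumptions: one added Met codon, one stop codon, no primer tails.

def encoded_residues(first, last):
    """Added initiator Met plus residues first..last inclusive."""
    return (last - first + 1) + 1

def coding_bp(first, last):
    """Nucleotides for the encoded residues plus one stop codon."""
    return 3 * (encoded_residues(first, last) + 1)

print(encoded_residues(292, 834))  # 544
print(coding_bp(292, 834))         # 1635
```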



Primer   SEQ ID NO:       SEQUENCE

TZA292   SEQ ID NO: 32    GTCGGCATATGGCTCCTGCTCCTCTTGAGGAGGCCCCCTGGCCCCCGCC
TZR01    SEQ ID NO: 33    GACGCAGATCTCAGCCCTTGGCGGAAAGCCAGTCCTC



Example 9

Derivation and Expression of 5' to 3'
Exonuclease-Deficient, Thermostable DNA
Polymerase of Thermus species SPS17
Comprising Amino Acids 288 Through 830

To obtain a DNA fragment encoding a 5' to 3'
exonuclease-deficient thermostable DNA polymerase from
Thermus species SPS17, a portion of the DNA polymerase
gene comprising amino acids 288 through 830 is
selectively amplified in a PCR with forward primer
TSA288 and reverse primer TSR01 as follows:

50 pmoles TSA288
50 pmoles TSR01
10 ng Thermus sp. SPS17 genomic DNA
2.5 units AmpliTaq DNA polymerase
50 µM each dATP, dGTP, dCTP, dTTP

in an 80 µl solution containing 10 mM Tris-HCl pH 8.3,
50 mM KCl and overlaid with 100 µl of mineral oil. The
reaction was initiated by addition of 20 µl containing
7.5 mM MgCl2 after the tubes had been placed in an 80°C
preheated cycler.

The genomic DNA was denatured at 98°C for 5 minutes
and cooled rapidly to 0°C. The sample was cycled in a
Perkin-Elmer Cetus Thermal Cycler according to the
following profile:

STEP CYCLE to 96°C and hold for 20 seconds.
STEP CYCLE to 55°C and hold for 30 seconds.
RAMP to 72°C over 30 seconds and hold for 1 minute.
REPEAT profile for 3 cycles.

STEP CYCLE to 96°C and hold for 20 seconds.
STEP CYCLE to 65°C and hold for 2 minutes.
REPEAT profile for 25 cycles.

After last cycle HOLD for 5 minutes.

The intended 1.65 kb PCR product is purified by
agarose gel electrophoresis, and recovered following
phenol-chloroform extraction and ethanol precipitation.
The purified product is digested with restriction
endonucleases NdeI and BglII and ligated with
NdeI/BamHI-digested and dephosphorylated plasmid vector
pDG164 (U.S. Serial No. 455,967, filed December 22,
1989, Example 6B). Ampicillin-resistant transformants
of E. coli strain DG116 are selected at 30°C and
screened for the desired recombinant plasmid. Plasmid
pSPSA288 encodes a 544 amino acid, 5' to 3'
exonuclease-deficient Thermus sp. SPS17 thermostable
DNA polymerase analogous to the pLSG8 encoded protein
of Example 2. The DNA polymerase activity is purified
as in Example 2. The purified protein is deficient in
5' to 3' exonuclease activity, is more thermoresistant
than the corresponding native enzyme and is
particularly useful in PCR of G+C-rich templates.

Primer   SEQ ID NO:       SEQUENCE

TSA288   SEQ ID NO: 34    GTCGGCATATGGCTCCTAAAGAAGCTGAGGAGGCCCCCTGGCCCCCGCC
TSR01    SEQ ID NO: 35    GACGCAGATCTCAGGCCTTGGCGGAAAGCCAGTCCTC



Example 10

Derivation and Expression of 5' to 3'
Exonuclease-Deficient, Thermostable DNA
Polymerase of Thermus thermophilus
Comprising Amino Acids 292 Through 834

To obtain a DNA fragment encoding a 5' to 3'
exonuclease-deficient thermostable DNA polymerase from
Thermus thermophilus, a portion of the DNA polymerase
gene comprising amino acids 292 through 834 is
selectively amplified in a PCR with forward primer
TZA292 and reverse primer DG122 as follows:

50 pmoles TZA292
50 pmoles DG122
1 ng EcoRI digested plasmid pLSG22
2.5 units AmpliTaq DNA polymerase
50 µM each dATP, dGTP, dCTP, dTTP

in an 80 µl solution containing 10 mM Tris-HCl pH 8.3,
50 mM KCl and overlaid with 100 µl of mineral oil. The
reaction was initiated by addition of 20 µl containing
7.5 mM MgCl2 after the tubes had been placed in an 80°C
preheated cycler.

Plasmid pLSG22 (U.S. Serial No. 455,967, filed
December 22, 1989, Example 4A, incorporated herein by
reference) was digested to completion with restriction
endonuclease EcoRI, denatured at 98°C for 5 minutes and
cooled rapidly to 0°C. The sample was cycled in a
Perkin-Elmer Cetus Thermal Cycler according to the
following profile:

STEP CYCLE to 96°C and hold for 20 seconds.
STEP CYCLE to 55°C and hold for 30 seconds.
RAMP to 72°C over 30 seconds and hold for 1 minute.
REPEAT profile for 3 cycles.

STEP CYCLE to 96°C and hold for 20 seconds.
STEP CYCLE to 65°C and hold for 2 minutes.
REPEAT profile for 20 cycles.

After last cycle HOLD for 5 minutes.

The intended 1.66 kb PCR product is purified by
agarose gel electrophoresis, and recovered following
phenol-chloroform extraction and ethanol precipitation.
The purified product is digested with restriction
endonucleases NdeI and BglII and ligated with
NdeI/BamHI-digested and dephosphorylated plasmid vector
pDG164 (U.S. Serial No. 455,967, filed December 22,
1989, Example 6B). Ampicillin-resistant transformants
of E. coli strain DG116 are selected at 30°C and
screened for the desired recombinant plasmid. Plasmid
pTTHA292 encodes a 544 amino acid, 5' to 3'
exonuclease-deficient Thermus thermophilus thermostable
DNA polymerase analogous to the pLSG8 encoded protein
of Example 2. The DNA polymerase activity is purified
as in Example 2. The purified protein is deficient in
5' to 3' exonuclease activity, is more thermoresistant
than the corresponding native enzyme and is
particularly useful in PCR of G+C-rich templates.

Primer   SEQ ID NO:       SEQUENCE

TZA292   SEQ ID NO: 32    GTCGGCATATGGCTCCTGCTCCTCTTGAGGAGGCCCCCTGGCCCCCGCC
DG122    SEQ ID NO: 36    CCTCTAAACGGCAGATCTGATATCAACCCTTGGCGGAAAGC



Example 11

Derivation and Expression of 5' to 3'
Exonuclease-Deficient, Thermostable DNA
Polymerase of Thermosipho africanus
Comprising Amino Acids 285 Through 892

To obtain a DNA fragment encoding a 5' to 3'
exonuclease-deficient thermostable DNA polymerase from
Thermosipho africanus, a portion of the DNA polymerase
gene comprising amino acids 285 through 892 is
selectively amplified in a PCR with forward primer
TAFI285 and reverse primer TAFR01 as follows:

50 pmoles TAFI285
50 pmoles TAFR01
1 ng plasmid pBSM:TafRV3' DNA
2.5 units AmpliTaq DNA polymerase
50 µM each dATP, dGTP, dCTP, dTTP

in an 80 µl solution containing 10 mM Tris-HCl pH 8.3,
50 mM KCl and overlaid with 100 µl of mineral oil. The
reaction was initiated by addition of 20 µl containing
7.5 mM MgCl2 after the tubes had been placed in an 80°C
preheated cycler.

Plasmid pBSM:TafRV3' (obtained as described in
CETUS CASE 2583.1, EX 4, p53, incorporated herein by
reference) was digested with EcoRI to completion and
the DNA was denatured at 98°C for 5 minutes and cooled
rapidly to 0°C. The sample was cycled in a
Perkin-Elmer Cetus Thermal Cycler according to the
following profile:

STEP CYCLE to 95°C and hold for 30 seconds.
STEP CYCLE to 55°C and hold for 30 seconds.
RAMP to 72°C over 30 seconds and hold for 1 minute.
REPEAT profile for 3 cycles.

STEP CYCLE to 95°C and hold for 30 seconds.
STEP CYCLE to 65°C and hold for 2 minutes.
REPEAT profile for 20 cycles.

After last cycle HOLD for 5 minutes.

The intended 1.86 kb PCR product is purified by
agarose gel electrophoresis, and recovered following
phenol-chloroform extraction and ethanol precipitation.
The purified product is digested with restriction
endonucleases NdeI and BamHI and ligated with
NdeI/BamHI-digested and dephosphorylated plasmid vector
pDG164 (U.S. Serial No. 455,967, filed December 22,
1989, Example 6B). Ampicillin-resistant transformants
of E. coli strain DG116 are selected at 30°C and
screened for the desired recombinant plasmid. Plasmid
pTAFI285 encodes a 609 amino acid, 5' to 3'
exonuclease-deficient Thermosipho africanus
thermostable DNA polymerase analogous to the
pTMA15-encoded protein of Example 3. The DNA
polymerase activity is purified as in Example 3. The
purified protein is deficient in 5' to 3' exonuclease
activity, is more thermoresistant than the
corresponding native enzyme and is particularly useful
in PCR of G+C-rich templates.

Primer   SEQ ID NO:       SEQUENCE

TAFI285  SEQ ID NO: 37    GTCGGCATATGATTAAAGAACTTAATTTACAAGAAT^TTAGAAAAGG
TAFR01   SEQ ID NO: 38    CCTTTACCCCAGGATCCTCATTCCCACTCTTTTCCATAATAAACAT



The foregoing written specification is considered
to be sufficient to enable one skilled in the art to
practice the invention. The present invention is not
to be limited in scope by the cell lines deposited,
since the deposited embodiment is intended as a single
illustration of one aspect of the invention and any
cell lines that are functionally equivalent are within
the scope of this invention. The deposit of materials
therein does not constitute an admission that the
written description herein contained is inadequate to
enable the practice of any aspect of the invention,
including the best mode thereof, nor are the deposits
to be construed as limiting the scope of the claims to
the specific illustrations that they represent.
Indeed, various modifications of the invention in
addition to those shown and described herein will
become apparent to those skilled in the art from the
foregoing description and fall within the scope of the
appended claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Gelfand, David H. 

Abramson, Richard D. 

(ii) TITLE OF INVENTION: 5' TO 3' EXONUCLEASE MUTATIONS OF 
THERMOSTABLE DNA POLYMERASES 

(iii) NUMBER OF SEQUENCES: 38 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Cetus Corporation 

(B) STREET: 1400 Fifty-third Street

(C) CITY: Emeryville 

(D) STATE: California 
(F) ZIP: 94608 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: WordPerfect 5.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: WO 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 590,490 

(B) FILING DATE: 28-SEP-1990 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 590,466 

(B) FILING DATE: 28-SEP-1990

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 590,213

(B) FILING DATE: 28-SEP-1990

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 523,394

(B) FILING DATE: 15-MAY-1990

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 143,441

(B) FILING DATE: 12-JAN-1988

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 063,509

(B) FILING DATE: 17-JUN-1987

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 899,241

(B) FILING DATE: 22-AUG-1986

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 746,121

(B) FILING DATE: 15-AUG-1991

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: WO PCT/US90/07641 

(B) FILING DATE: 21-DEC-1990 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 585,471 

(B) FILING DATE: 20-SEP-1990 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 455,611 

(B) FILING DATE: 22-DEC-1989 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 609,157 

(B) FILING DATE: 02-NOV-1990

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 557,517 

(B) FILING DATE: 24-JUL-1990

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Sias Ph.D, Stacey R. 

(B) REGISTRATION NUMBER: 32,630 

(C) REFERENCE/DOCKET NUMBER: Case No. 2580 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 415-420-3300 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Thermus aquaticus

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2496



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG AGG GGG ATG CTG CCC CTC TTT GAG CCC AAG GGC CGG GTC CTC CTG 48

Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15

GTG GAC GGC CAC CAC CTG GCC TAC CGC ACC TTC CAC GCC CTG AAG GGC 96

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30

CTC ACC ACC AGC CGG GGG GAG CCG GTG CAG GCG GTC TAC GGC TTC GCC 144

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45

AAG AGC CTC CTC AAG GCC CTC AAG GAG GAC GGG GAC GCG GTG ATC GTG 192

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60

GTC TTT GAC GCC AAG GCC CCC TCC TTC CGC CAC GAG GCC TAC GGG GGG 240

Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80

TAC AAG GCG GGC CGG GCC CCC ACG CCG GAG GAC TTT CCC CGG CAA CTC 288

Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95

GCC CTC ATC AAG GAG CTG GTG GAC CTC CTG GGG CTG GCG CGC CTC GAG 336

Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110

GTC CCG GGC TAC GAG GCG GAC GAC GTC CTG GCC AGC CTG GCC AAG AAG 384

Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125

GCG GAA AAG GAG GGC TAC GAG GTC CGC ATC CTC ACC GCC GAC AAA GAC 432

Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140

CTT TAC CAG CTC CTT TCC GAC CGC ATC CAC GTC CTC CAC CCC GAG GGG 480

Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160

TAC CTC ATC ACC CCG GCC TGG CTT TGG GAA AAG TAC GGC CTG AGG CCC 528

Tyr Leu He Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro 

165 170 175 

GAG GAG TGG GCC GAG TAG CGG GCC CTG ACC GGG GAG GAG TCC GAG AAC 576 

Asp Gin Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn 

180 185 190 

CTT CCC GGG GTC AAG GGC ATC GGG GAG AAG ACG GCG AGG AAG CTT CTG 624 

Leu Pro Gly Val Lys Gly He Gly Glu Lys Thr Ala Arg Lys Leu Leu 
195 200 205 

GAG GAG TGG GGG AGG CTG GAA GCC CTC CTC AAG AAC CTG GAC CGG CTG 672 

Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu 
210 215 220 

AAG CCC GCC ATC CGG GAG AAG ATC CTG GCC CAC ATG GAC GAT CTG AAG 720 

Lys Pro Ala He Arg Glu Lys He Leu Ala His Met Asp Asp Leu Lys 

230 235 240 

CTC TCC TGG GAC CTG GCC AAG GTG GGC ACC GAC CTG CCC CTG GAG GTG 768 

Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val 

245 250 255 

GAC TTC GCC AAA AGG CGG GAG CCC GAC CGG GAG AGG CTT AGG GCC TTT 816 

Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe 

260 265 270 

CTG GAG AGG CTT GAG TTT GGC AGC CTC CTC CAC GAS TTC GGC CTT CTG 864 

Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu 
275 280 285 

GAA AGC CCC AAG GCC CTG GAG GAG GCC CCC TGG CCC CCG CCG GAA GGG 912 

Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly
290 295 300 

GCC TTC GTG GGC TTT GTG CTT TCC GGC AAG GAG CCC ATG TGG GCC GAT 960 

Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp 
305 310 315 320 

CTT CTG GCC CTG GCC GCC GCC AGG GGG GGC CGG GTC CAC CGG GCC CCC 1008 

Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro 

325 330 335

GAG CCT TAT AAA GCC CTC AGG GAG CTG AAG GAG GGG CGG GGG CTT CTG 1056

Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu 

340 345 350 

GCC AAA GAC CTG AGC GTT CTG GCC CTG AGG GAA GGC CTT GGC CTC CCG 1104 

Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
355 360 365 

CCC GGC GAC GAC GCC ATG CTC CTC GCC TAG CTC CTG GAC CCT TCC AAC 1152 

Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn 
370 375 380 

ACC ACC CCC GAG GGG GTG GCC CGG CGC TAC GGC GGG GAG TGG AGG GAG 1200 

Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu 
385 390 395 400 

GAG GCG GGG GAG CGG GCC GCC CTT TCC GAG AGG CTC TTC GCC AAC CTG 1248 

Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu 

405 410 415 

TGG GGG AGG CTT GAG GGG GAG GAG AGG CTC CTT TGG CTT TAC CGG GAG 1296

Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu

420 425 430 

GTG GAG AGG CCC CTT TCC GCT GTC CTG GCC CAC ATG GAG GCC ACG GGG 1344 

Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly 
435 440 445 

GTG CGC CTG GAC GTG GCC TAT CTC AGG GCC TTG TCC CTG GAG GTG GCC 1392 

Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala 
450 455 460 

GAG GAG ATC GCC CGC CTC GAG GCC GAG GTC TTC CGC CTG GCC GGC CAC 1440 

Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480 

CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTC CTC TTT GAC 1488 

Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp

485 490 495 

GAG CTA GGG CTT CCC GCC ATC GGC AAG ACG GAG AAG ACC GGC AAG CGC 1536 

Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510

TCC ACC AGC GCC GCC GTC CTG GAG GCC CTC CGC GAG GCC GAG CCC ATC 1584

Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525 

GTG GAG AAG ATC CTG CAG TAG CGG GAG CTC ACC AAG CTG AAG AGC ACC 1632 

Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540 

TAG ATT GAG CCC TTG CCG GAG CTC ATC CAC CCC AGG ACG GGC CGC CTC 1680 

Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560

CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGC AGG CTA ACT AGC 1728 

His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser

565 570 575 

TCC GAT CCC AAC CTC CAG AAC ATC CCC GTC CGC ACC CCG CTT GGG CAG 1776 

Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln

580 585 590 

AGG ATC CGC CGG GCC TTC ATC GCC GAG GAG GGG TGG CTA TTG GTG GCC 1824 

Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605 

CTG GAC TAT AGC CAG ATA GAG CTC AGG GTG CTG GCC CAC CTC TGC GGC 1872 

Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620 

GAC GAG AAC CTG ATC CGG GTC TTC CAG GAG GGG CGG GAC ATC CAC ACG 1920 

Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640

GAG ACC GCC AGC TGG ATG TTC GGC GTC CCC CGG GAG GCC GTG GAC CCC 1968 

Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro 

645 650 655

CTG ATG CGC CGG GCG GCC AAG ACC ATC AAC TTC GGG GTC CTC TAG GGC 2016 

Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly

660 665 670 

ATG TCG GCC CAC CGC CTC TCC CAG GAG CTA GCC ATC CCT TAG GAG GAG 2064 

Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685

GCC CAG GCC TTG ATT GAG CGC TAG TTT GAG AGC TTC CGC AAG GTG GGG 2112

Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700

GCC TGG ATT GAG AAG ACC CTG GAG GAG GGG AGG AGG CGG GGG TAG GTG 2160

Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720

GAG ACC CTC TTC GGC CGC CGC CGC TAG GTG CCA GAG CTA GAG GCC CGG 2208

Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg
725 730 735

GTG AAG AGC GTG CGG GAG GCG GCC GAG CGC ATG GCC TTC AAG ATG CGC 2256

Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750

GTC CAG GGC ACC GCC GCC GAG CTC ATG AAG CTG GCT ATG GTG AAG CTG 2304

Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765

TTC CCC AGG CTG GAG GAA ATG GGG GCC AGG ATG CTC CTT CAG GTC CAC 2352

Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780

GAG GAG CTG GTC CTC GAG GCC CCA AAA GAG AGG GCG GAG GCC GTG GCC 2400

Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala
785 790 795 800

CGG CTG GCC AAG GAG GTC ATG GAG GGG GTG TAT CCC CTG GCC GTG CCC 2448

Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro
805 810 815

CTG GAG GTG GAG GTG GGG ATA GGG GAG GAC TGG CTC TCC GCC AAG GAG 2496

Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu
820 825 830

TGA 2499



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 832 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Arg Gly Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu
1 5 10 15

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe His Ala Leu Lys Gly
20 25 30

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val
50 55 60

Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly
65 70 75 80

Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
85 90 95

Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110

Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala Lys Lys
115 120 125

Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Lys Asp
130 135 140

Leu Tyr Gln Leu Leu Ser Asp Arg Ile His Val Leu His Pro Glu Gly
145 150 155 160

Tyr Leu Ile Thr Pro Ala Trp Leu Trp Glu Lys Tyr Gly Leu Arg Pro
165 170 175

Asp Gln Trp Ala Asp Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn
180 185 190

Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu
195 200 205

Glu Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu
210 215 220

Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu Lys
225 230 235 240

Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu Pro Leu Glu Val
245 250 255

Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg Glu Arg Leu Arg Ala Phe
260 265 270

Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu
275 280 285 

Glu Ser Pro Lys Ala Leu Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly 
290 295 300 

Ala Phe Val Gly Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp 
305 310 315 320 

Leu Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro 

325 330 335 

Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu 

340 345 350 

Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro 
355 360 365 

Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn 
370 375 380 

Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu 
385 390 395 400 

Glu Ala Gly Glu Arg Ala Ala Leu Ser Glu Arg Leu Phe Ala Asn Leu 

405 410 415 

Trp Gly Arg Leu Glu Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu 

420 425 430 

Val Glu Arg Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly 
435 440 445 

Val Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala 
450 455 460 

Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly His
465 470 475 480

Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
485 490 495

Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg
500 505 510

Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile
515 520 525

Val Glu Lys Ile Leu Gln Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr
530 535 540

Tyr Ile Asp Pro Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu
545 550 555 560

His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser
565 570 575

Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590

Arg Ile Arg Arg Ala Phe Ile Ala Glu Glu Gly Trp Leu Leu Val Ala
595 600 605

Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
610 615 620

Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr
625 630 635 640

Glu Thr Ala Ser Trp Met Phe Gly Val Pro Arg Glu Ala Val Asp Pro
645 650 655

Leu Met Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly
660 665 670

Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr Glu Glu
675 680 685

Ala Gln Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg
690 695 700

Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val
705 710 715 720 

Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg 

725 730 735 

Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro 

740 745 750 

Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu
755 760 765 

Phe Pro Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln Val His
770 775 780 

Asp Glu Leu Val Leu Glu Ala Pro Lys Glu Arg Ala Glu Ala Val Ala 
785 790 795 800 

Arg Leu Ala Lys Glu Val Met Glu Gly Val Tyr Pro Leu Ala Val Pro 

805 810 815 

Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Glu

820 825 830 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2682 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Thermotoga maritima

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2679



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG GCG AGA CTA TTT CTC TTT GAT GGA ACT GCT CTG GCC TAC AGA GCG 48 

Met Ala Arg Leu Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala
1 5 10 15

TAC TAT GCG CTC GAT AGA TCG CTT TCT ACT TCC ACC GGC ATT CCC ACA 96 

Tyr Tyr Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr

20 25 30 

AAC GCC ACA TAC GGT GTG GCG AGG ATG CTG GTG AGA TTC ATC AAA GAC 144 

Asn Ala Thr Tyr Gly Val Ala Arg Met Leu Val Arg Phe Ile Lys Asp
35 40 45 

CAT ATC ATT GTC GGA AAA GAC TAC GTT GCT GTG GCT TTC GAC AAA AAA 192 

His Ile Ile Val Gly Lys Asp Tyr Val Ala Val Ala Phe Asp Lys Lys
50 55 60 

GCT GCC ACC TTC AGA CAC AAG CTC CTC GAG ACT TAC AAG GCT CAA AGA 240 

Ala Ala Thr Phe Arg His Lys Leu Leu Glu Thr Tyr Lys Ala Gln Arg
65 70 75 80 

CCA AAG ACT CCG GAT CTC CTG ATT CAG CAG CTT CCG TAC ATA AAG AAG 288 

Pro Lys Thr Pro Asp Leu Leu Ile Gln Gln Leu Pro Tyr Ile Lys Lys

85 90 95 

CTG GTC GAA GCC CTT GGA ATG AAA GTG CTG GAG GTA GAA GGA TAC GAA 336 

Leu Val Glu Ala Leu Gly Met Lys Val Leu Glu Val Glu Gly Tyr Glu 

100 105 110

GCG GAG GAT ATA ATT GCC ACT CTG GCT GTG AAG GGG CTT CCG CTT TTT 384

Ala Asp Asp Ile Ile Ala Thr Leu Ala Val Lys Gly Leu Pro Leu Phe
115 120 125 

GAT GAA ATA TTC ATA GTG ACC GGA GAT AAA GAG ATG CTT GAG CTT GTG 432 

Asp Glu Ile Phe Ile Val Thr Gly Asp Lys Asp Met Leu Gln Leu Val
130 135 140 

AAC GAA AAG ATC AAG GTG TGG GGA ATG GTA AAA GGG ATA TCC GAT CTG 480 

Asn Glu Lys Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu
145 150 155 160

GAA CTT TAG GAT GCG GAG AAG GTG AAG GAA AAA TAG GGT GTT GAA GCC 528 

Glu Leu Tyr Asp Ala Gln Lys Val Lys Glu Lys Tyr Gly Val Glu Pro

165 170 175 

GAG CAG ATC CCG GAT CTT CTG GCT GTA ACC GGA GAT GAA ATA GAG AAC 576 

Gln Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile Asp Asn

180 185 190 

ATC GCC GGT GTA ACT GGG ATA GGT GAA AAG ACT GCT GTT CAG CTT GTA 624 

Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala Val Gln Leu Leu
195 200 205 

GAG AAG TAG AAA GAC CTC GAA GAC ATA CTG AAT CAT GTT CGC GAA CTT 672 

Glu Lys Tyr Lys Asp Leu Glu Asp Ile Leu Asn His Val Arg Glu Leu
210 215 220 

CCT GAA AAG GTG AGA AAA GCC CTG CTT GGA GAC AGA GAA AAC GCC ATT 720 

Pro Gln Lys Val Arg Lys Ala Leu Leu Arg Asp Arg Glu Asn Ala Ile
225 230 235 240 

CTC AGC AAA AAG CTG GCG ATT CTG GAA ACA AAC GTT CCG ATT GAA ATA 768 

Leu Ser Lys Lys Leu Ala Ile Leu Glu Thr Asn Val Pro Ile Glu Ile

245 250 255 

AAC TGG GAA GAA CTT CGC TAG CAG GGC TAG GAC AGA GAG AAA CTC TTA 816 

Asn Trp Glu Glu Leu Arg Tyr Gln Gly Tyr Asp Arg Glu Lys Leu Leu

260 265 270 

CCA CTT TTG AAA GAA CTG GAA TTC GGA TCC ATC ATG AAG GAA CTT GAA 864 

Pro Leu Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln
275 280 285

CTG TAG GAA GAG TCC GAA GCC GTT GGA TAG AGA ATA GTG AAA GAC CTA 912

Leu Tyr Glu Glu Ser Glu Pro Val Gly Tyr Arg Ile Val Lys Asp Leu
290 295 300 

GTG GAA TTT GAA AAA CTC ATA GAG AAA CTG AGA GAA TCG CCT TCG TTC 960

Val Glu Phe Glu Lys Leu Ile Glu Lys Leu Arg Glu Ser Pro Ser Phe
305 310 315 320 

GCC ATA GAT CTT GAG ACG TCT TCC CTC GAT CCT TTC GAC TGC GAG ATT 1008 

Ala Ile Asp Leu Glu Thr Ser Ser Leu Asp Pro Phe Asp Cys Asp Ile

325 330 335 

GTC GGT ATC TCT GTG TCT TTC AAA CCA AAG GAA GCG TAG TAG ATA CCA 1056 

Val Gly Ile Ser Val Ser Phe Lys Pro Lys Glu Ala Tyr Tyr Ile Pro

340 345 350 

CTC CAT CAT AGA AAC GCC GAG AAC CTG GAC GAA AAA GAG GTT CTG AAA 1104 

Leu His His Arg Asn Ala Gln Asn Leu Asp Glu Lys Glu Val Leu Lys
355 360 365 

AAG CTC AAA GAA ATT CTG GAG GAC GCC GGA GCA AAG ATC GTT GGT GAG 1152 

Lys Leu Lys Glu Ile Leu Glu Asp Pro Gly Ala Lys Ile Val Gly Gln
370 375 380 

AAT TTG AAA TTC GAT TAG AAG GTG TTG ATG GTG AAG GGT GTT GAA CCT 1200 

Asn Leu Lys Phe Asp Tyr Lys Val Leu Met Val Lys Gly Val Glu Pro 
385 390 395 400 

GTT CCT CCT TAC TTC GAC ACG ATG ATA GCG GCT TAG CTT CTT GAG CCG 1248 

Val Pro Pro Tyr Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro

405 410 415 

AAC GAA AAG AAG TTC AAT CTG GAC GAT CTC GCA TTG AAA TTT CTT GGA 1296 

Asn Glu Lys Lys Phe Asn Leu Asp Asp Leu Ala Leu Lys Phe Leu Gly 

420 425 430 

TAC AAA ATG AGA TCT TAC GAA GAG CTC ATG TCC TTC TCT TTT CCG CTG 1344 

Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Phe Pro Leu
435 440 445 

TTT GGT TTC AGT TTT GCC GAT GTT CCT GTA GAA AAA GCA GCG AAC TAC 1392 

Phe Gly Phe Ser Phe Ala Asp Val Pro Val Glu Lys Ala Ala Asn Tyr 
450 455 460

TCC TGT GAA GAT GCA GAG ATC ACC TAG AGA CTT TAG AAG ACC CTG AGG 1440

Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Thr Leu Ser
465 470 475 480

TTA AAA CTC GAG GAG GCA GAT GTG GAA AAG GTG TTG TAG AAG ATA GAA 1488

Leu Lys Leu His Glu Ala Asp Leu Glu Asn Val Phe Tyr Lys Ile Glu

485 490 495 

ATG GCG CTT GTG AAG GTG CTT GCA CGG ATG GAA CTG AAG GGT GTG TAT 1536 

Met Pro Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr 

500 505 510 

GTG GAC ACA GAG TTC CTG AAG AAA CTC TCA GAA GAG TAG GGA AAA AAA 1584 

Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys
515 520 525 

CTC GAA GAA CTG GCA GAG GAA ATA TAG AGG ATA GGT GGA GAG GCG TTC 1632 

Leu Glu Glu Leu Ala Glu Glu Ile Tyr Arg Ile Ala Gly Glu Pro Phe
530 535 540 

AAG ATA AAG TCA GCG AAG GAG GTT TCA AGG ATC CTT TTT GAA AAA CTC 1680 

Asn Ile Asn Ser Pro Lys Gln Val Ser Arg Ile Leu Phe Glu Lys Leu
545 550 555 560 

GGC ATA AAA CCA GGT GGT AAA AGG AGG AAA AGG GGA GAC TAT TCA ACA 1728 

Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Asp Tyr Ser Thr

565 570 575 

GGC ATA GAA GTC CTC GAG GAA CTT GCC GGT GAA CAC GAA ATC ATT CCT 1776 

Arg Ile Glu Val Leu Glu Glu Leu Ala Gly Glu His Glu Ile Ile Pro

580 585 590 

CTG ATT CTT GAA TAG AGA AAG ATA CAG AAA TTG AAA TCA ACC TAG ATA 1824 

Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr Ile
595 600 605 

GAC GCT CTT GCC AAG ATG GTC AAC CCA AAG ACC GGA AGG ATT CAT GCT 1872 

Asp Ala Leu Pro Lys Met Val Asn Pro Lys Thr Gly Arg Ile His Ala
610 615 620 

TCT TTC AAT CAA AGG GGG ACT GCC ACT GGA AGA CTT AGG AGC AGC GAT 1920 

Ser Phe Asn Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp
625 630 635 640

CCC AAT CTT CAG AAC CTC CCG ACC AAA AGT GAA GAG GGA AAA GAA ATC 1968

Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile

645 650 655 

AGG AAA GCG ATA GTT CCT CAG GAT CCA AAC TGG TGG ATC GTC AGT GCC 2016 

Arg Lys Ala Ile Val Pro Gln Asp Pro Asn Trp Trp Ile Val Ser Ala

660 665 670 

GAC TAG TCC CAA ATA GAA CTG AGG ATC CTC GCC CAT CTC AGT GGT GAT 2064 

Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp
675 680 685 

GAG AAT CTT TTG AGG GCA TTC GAA GAG GGC ATC GAC GTC CAG ACT CTA 2112 

Glu Asn Leu Leu Arg Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu
690 695 700 

ACA GCT TCC AGA ATA TTC AAC GTG AAA GCC GAA GAA GTA ACC GAA GAA 2160 

Thr Ala Ser Arg Ile Phe Asn Val Lys Pro Glu Glu Val Thr Glu Glu
705 710 715 720 

ATG CGC CGC GCT GGT AAA ATG GTT AAT TTT TCC ATC ATA TAG GCT GTA 2208 

Met Arg Arg Ala Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly Val

725 730 735 

ACA CCT TAG GGT CTG TCT GTG AGG CTT GGA GTA CCT GTG AAA GAA GCA 2256 

Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Val Pro Val Lys Glu Ala 

740 745 750 

GAA AAG ATG ATC GTC AAC TAG TTC GTC CTC TAG CCA AAG GTG CGC GAT 2304 

Glu Lys Met Ile Val Asn Tyr Phe Val Leu Tyr Pro Lys Val Arg Asp
755 760 765 

TAG ATT CAG AGG GTC GTA TGG GAA GCG AAA GAA AAA GGC TAT GTT AGA 2352 

Tyr Ile Gln Arg Val Val Ser Glu Ala Lys Glu Lys Gly Tyr Val Arg
770 775 780 

ACG CTG TTT GGA AGA AAA AGA GAC ATA CCA CAG CTC ATG GCC CCG GAC 2400 

Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp
785 790 795 800 

AGG AAC ACA CAG GCT GAA GGA GAA CGA ATT GCC ATA AAC ACT GCC ATA 2448 

Arg Asn Thr Gln Ala Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile
805 810 815

CAG GGT ACA GCA GCG GAT ATA ATA AAG CTG GCT ATG ATA GAA ATA GAG 2496

Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Glu Ile Asp

820 825 830 

AGG GAA CTG AAA GAA AGA AAA ATG AGA TCG AAG ATG ATC ATA GAG GTC 2544 

Arg Glu Leu Lys Glu Arg Lys Met Arg Ser Lys Met Ile Ile Gln Val
835 840 845 

CAC GAG GAA CTG GTT TTT GAA GTG CCC AAT GAG GAA AAG GAG GCG CTC 2592 

His Asp Glu Leu Val Phe Glu Val Pro Asn Glu Glu Lys Asp Ala Leu 
850 855 860 

GTC GAG CTG GTG AAA GAG AGA ATG ACG AAT GTG GTA AAG CTT TCA GTG 2640 

Val Glu Leu Val Lys Asp Arg Met Thr Asn Val Val Lys Leu Ser Val 
865 870 875 880 

CCG CTC GAA GTG GAT GTA ACC ATC GGC AAA ACA TGG TCG TGA 2682 

Pro Leu Glu Val Asp Val Thr Ile Gly Lys Thr Trp Ser

885 890 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 893 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ala Arg Leu Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala
1 5 10 15

Tyr Tyr Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr
20 25 30

Asn Ala Thr Tyr Gly Val Ala Arg Met Leu Val Arg Phe Ile Lys Asp
35 40 45

His Ile Ile Val Gly Lys Asp Tyr Val Ala Val Ala Phe Asp Lys Lys
50 55 60

Ala Ala Thr Phe Arg His Lys Leu Leu Glu Thr Tyr Lys Ala Gln Arg
65 70 75 80

Pro Lys Thr Pro Asp Leu Leu Ile Gln Gln Leu Pro Tyr Ile Lys Lys
85 90 95

Leu Val Glu Ala Leu Gly Met Lys Val Leu Glu Val Glu Gly Tyr Glu
100 105 110

Ala Asp Asp Ile Ile Ala Thr Leu Ala Val Lys Gly Leu Pro Leu Phe
115 120 125

Asp Glu Ile Phe Ile Val Thr Gly Asp Lys Asp Met Leu Gln Leu Val
130 135 140

Asn Glu Lys Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu
145 150 155 160

Glu Leu Tyr Asp Ala Gln Lys Val Lys Glu Lys Tyr Gly Val Glu Pro
165 170 175

Gln Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile Asp Asn
180 185 190

Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala Val Gln Leu Leu
195 200 205

Glu Lys Tyr Lys Asp Leu Glu Asp Ile Leu Asn His Val Arg Glu Leu
210 215 220

Pro Gln Lys Val Arg Lys Ala Leu Leu Arg Asp Arg Glu Asn Ala Ile
225 230 235 240

Leu Ser Lys Lys Leu Ala Ile Leu Glu Thr Asn Val Pro Ile Glu Ile
245 250 255

Asn Trp Glu Glu Leu Arg Tyr Gln Gly Tyr Asp Arg Glu Lys Leu Leu
260 265 270

Pro Leu Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln
275 280 285

Leu Tyr Glu Glu Ser Glu Pro Val Gly Tyr Arg Ile Val Lys Asp Leu
290 295 300

Val Glu Phe Glu Lys Leu Ile Glu Lys Leu Arg Glu Ser Pro Ser Phe
305 310 315 320

Ala Ile Asp Leu Glu Thr Ser Ser Leu Asp Pro Phe Asp Cys Asp Ile
325 330 335

Val Gly Ile Ser Val Ser Phe Lys Pro Lys Glu Ala Tyr Tyr Ile Pro
340 345 350

Leu His His Arg Asn Ala Gln Asn Leu Asp Glu Lys Glu Val Leu Lys
355 360 365

Lys Leu Lys Glu Ile Leu Glu Asp Pro Gly Ala Lys Ile Val Gly Gln
370 375 380

Asn Leu Lys Phe Asp Tyr Lys Val Leu Met Val Lys Gly Val Glu Pro
385 390 395 400

Val Pro Pro Tyr Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro

405 410 415 

Asn Glu Lys Lys Phe Asn Leu Asp Asp Leu Ala Leu Lys Phe Leu Gly 

420 425 430 

Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Phe Pro Leu
435 440 445

Phe Gly Phe Ser Phe Ala Asp Val Pro Val Glu Lys Ala Ala Asn Tyr
450 455 460

Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Thr Leu Ser
465 470 475 480

Leu Lys Leu His Glu Ala Asp Leu Glu Asn Val Phe Tyr Lys Ile Glu
485 490 495

Met Pro Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr
500 505 510

Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys
515 520 525

Leu Glu Glu Leu Ala Glu Glu Ile Tyr Arg Ile Ala Gly Glu Pro Phe
530 535 540

Asn Ile Asn Ser Pro Lys Gln Val Ser Arg Ile Leu Phe Glu Lys Leu
545 550 555 560

Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Asp Tyr Ser Thr
565 570 575

Arg Ile Glu Val Leu Glu Glu Leu Ala Gly Glu His Glu Ile Ile Pro
580 585 590

Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr Ile
595 600 605

Asp Ala Leu Pro Lys Met Val Asn Pro Lys Thr Gly Arg Ile His Ala
610 615 620

Ser Phe Asn Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp
625 630 635 640

Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile
645 650 655

Arg Lys Ala Ile Val Pro Gln Asp Pro Asn Trp Trp Ile Val Ser Ala
660 665 670

Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp
675 680 685

Glu Asn Leu Leu Arg Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu
690 695 700

Thr Ala Ser Arg Ile Phe Asn Val Lys Pro Glu Glu Val Thr Glu Glu
705 710 715 720

Met Arg Arg Ala Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly Val
725 730 735

Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Val Pro Val Lys Glu Ala
740 745 750

Glu Lys Met Ile Val Asn Tyr Phe Val Leu Tyr Pro Lys Val Arg Asp
755 760 765

Tyr Ile Gln Arg Val Val Ser Glu Ala Lys Glu Lys Gly Tyr Val Arg
770 775 780

Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp
785 790 795 800

Arg Asn Thr Gln Ala Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile
805 810 815

Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Glu Ile Asp
820 825 830

Arg Glu Leu Lys Glu Arg Lys Met Arg Ser Lys Met Ile Ile Gln Val
835 840 845

His Asp Glu Leu Val Phe Glu Val Pro Asn Glu Glu Lys Asp Ala Leu
850 855 860

Val Glu Leu Val Lys Asp Arg Met Thr Asn Val Val Lys Leu Ser Val
865 870 875 880

Pro Leu Glu Val Asp Val Thr Ile Gly Lys Thr Trp Ser
885 890



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2493 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Thermus species sps17



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2490 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATG CTG CCC CTC TTT GAG CCC AAG GGC CGG GTC CTC CTG GTG GAC GGC 48 

Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu Val Asp Gly 
1 5 10 15 

CAC GAC CTG GCC TAG GGC ACC TTT TTC GCC CTC AAG GGC CTC ACC ACC 96 

His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly Leu Thr Thr 

20 25 30 

AGC CGG GGC GAG CCC GTG CAG GCG GTT TAT GGC TTC GCC AAA AGC CTC 144 

Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala Lys Ser Leu
35 40 45 

CTC AAG GCC CTG AAG GAG GAT GGG GAG GTG GCC ATC GTG GTC TTT GAC 192 

Leu Lys Ala Leu Lys Glu Asp Gly Glu Val Ala Ile Val Val Phe Asp
50 55 60 

GCC AAG GCC CCC TCC TTC CGC CAC GAG GCC TAG GAG GCC TAG AAG GCG 240 

Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu Ala Tyr Lys Ala 
65 70 75 80 

GGC CGG GCC CCC ACC GCG GAG GAC TTT CCC CGG CAG CTC GCC CTC ATC 288 

Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Ile

85 90 95 

AAG GAG CTG GTG GAC CTT TTG GGC CTC GTG CGC CTT GAG GTC GCG GGC 336 

Lys Glu Leu Val Asp Leu Leu Gly Leu Val Arg Leu Glu Val Pro Gly 

100 105 110 

TTT GAG GCG GAC GAT GTC CTC GCC ACC CTG GCC AAG AAG GCA GAA AGG 384 

Phe Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys Lys Ala Glu Arg 
115 120 125 

GAG GGG TAG GAG GTG CGC ATC CTG AGC GCG GAC CGC GAC CTC TAC CAG 432 

Glu Gly Tyr Glu Val Arg Ile Leu Ser Ala Asp Arg Asp Leu Tyr Gln
130 135 140 

CTC CTT TCC GAC CGG ATC CAC CTC CTC CAC CCC GAG GGG GAG GTC CTG 480 

Leu Leu Ser Asp Arg Ile His Leu Leu His Pro Glu Gly Glu Val Leu
145 150 155 160

ACC CCC GGG TGG CTC GAG GAG CGC TAG GGC CTC TCC CCG GAG AGG TGG 528

Thr Pro Gly Trp Leu Gln Glu Arg Tyr Gly Leu Ser Pro Glu Arg Trp

165 170 175 

GTG GAG TAG GGG GCC CTG GTG GGG GAG CCT TCG GAG AAC CTC CCC GGG 576 

Val Glu Tyr Arg Ala Leu Val Gly Asp Pro Ser Asp Asn Leu Pro Gly 

180 185 190 

GTG CCC GGC ATC GGG GAG AAG ACC GCC CTG AAG CTC CTG AAG GAG TGG 624 

Val Pro Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp
195 200 205 

GGT AGC CTG GAA GCC ATT CTA AAG AAC CTG GAG GAG GTG AAG CCG GAA 672 

Gly Ser Leu Glu Ala Ile Leu Lys Asn Leu Asp Gln Val Lys Pro Glu
210 215 220 

AGG GTG CGC GAG GCC ATC GGG AAT AAC CTG GAT AAG CTC GAG ATG TCC 720 

Arg Val Arg Glu Ala Ile Arg Asn Asn Leu Asp Lys Leu Gln Met Ser
225 230 235 240 

CTG GAG CTT TCC CGC CTC CGC ACC GAC CTC CCC CTG GAG GTG GAG TTC 768 

Leu Glu Leu Ser Arg Leu Arg Thr Asp Leu Pro Leu Glu Val Asp Phe 

245 250 255 

GCC AAG AGG CGG GAG CCC GAC TGG GAG GGG CTT AAG GCC TTT TTG GAG 816

Ala Lys Arg Arg Glu Pro Asp Trp Glu Gly Leu Lys Ala Phe Leu Glu 

260 265 270 

CGG CTT GAG TTC GGA AGC CTC CTC CAC GAG TTC GGC CTT CTG GAG GCC 864 

Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu Glu Ala
275 280 285 

CCC AAG GAG GGG GAG GAG GCC CCC TGG CCC CCG CCT GGA GGG GCC TTT 912 

Pro Lys Glu Ala Glu Glu Ala Pro Trp Pro Pro Pro Gly Gly Ala Phe 
290 295 300 

TTC GGC TTC CTC CTC TCC CGC CCC GAG CCC ATG TGG GCG GAG CTT TTG 960 

Leu Gly Phe Leu Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu Leu 
305 310 315 320 

GCC CTG GCG GGG GCC AAG GAG GGG CGG GTC CAT CGG GCG GAA GAC CCC 1008 

Ala Leu Ala Gly Ala Lys Glu Gly Arg Val His Arg Ala Glu Asp Pro
325 330 335

GTG GGG GCC CTA AAG GAG CTG AAG GAG ATC CGG GGC CTC CTC GCC AAG 1056

Val Gly Ala Leu Lys Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys

340 345 350 

GAC CTC TCG GTC CTG GCC CTG AGG GAG GGC CGG GAG ATC CCG CCG GGG 1104

Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Arg Glu Ile Pro Pro Gly
355 360 365 

GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCG GGG AAC ACC AAC 1152 

Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Gly Asn Thr Asn 
370 375 380 

CCC GAG GGG GTG GCC CGG CGG TAC GGG GGG GAG TGG AAG GAG GAC GCC 1200 

Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Lys Glu Asp Ala
385 390 395 400

GCC GCC CGG GCC CTC CTT TCG GAA AGG CTC TGG CAG GCC CTT TAC CCC 1248 

Ala Ala Arg Ala Leu Leu Ser Glu Arg Leu Trp Gln Ala Leu Tyr Pro
405 410 415

CGG GTG GCG GAG GAG GAA AGG CTC CTT TGG CTC TAG CGG GAG GTG GAG 1296 

Arg Val Ala Glu Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu Val Glu 

420 425 430

CGG CCC CTC GCC CAG GTC CTC GCC CAC ATG GAG GCC ACG GGG GTG CGG 1344 

Arg Pro Leu Ala Gln Val Leu Ala His Met Glu Ala Thr Gly Val Arg
435 440 445 

CTG GAT GTG CCC TAC CTG GAG GCC CTT TCC CAG GAG GTG GCC TTT GAG 1392

Leu Asp Val Pro Tyr Leu Glu Ala Leu Ser Gln Glu Val Ala Phe Glu
450 455 460

CTG GAG CGC CTC GAG GCC GAG GTC CAC CGC CTG GCG GGC CAC CCC TTC 1440 

Leu Glu Arg Leu Glu Ala Glu Val His Arg Leu Ala Gly His Pro Phe 
465 470 475 480

AAC CTG AAC TCT AGG GAC CAG CTG GAG CGG GTC CTC TTT GAC GAG CTC 1488 

Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu Leu

485 490 495 

GGC CTA CCC CCC ATC GGC AAG ACG GAG AAG ACG GGC AAG CGC TCC ACC 1536 

Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg Ser Thr
500 505 510

AGC GCC GCC GTC CTG GAG CTC TTA AGG GAG GCC CAC CCC ATC GTG GGG 1584

Ser Ala Ala Val Leu Glu Leu Leu Arg Glu Ala His Pro Ile Val Gly
515 520 525 

GGG ATC CTG GAG TAG CGG GAG CTC ATG AAG CTC AAG AGC ACC TAG ATA 1632 

Arg Ile Leu Glu Tyr Arg Glu Leu Met Lys Leu Lys Ser Thr Tyr Ile
530 535 540 

GAC CCC CTC CCC AGG CTG GTC CAC CCC AAA ACC GGC CGG CTC CAC ACC 1680 

Asp Pro Leu Pro Arg Leu Val His Pro Lys Thr Gly Arg Leu His Thr 
545 550 555 560 

CGC TTC AAC GAG AGG GCC ACC GCC ACG GGC CGC CTC TCC AGC TCC GAC 1728 

Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp

565 570 575 

CCC AAC CTG GAG AAC ATC CCC GTG CGC ACC CCC TTA GGC CAG CGC ATC 1776 

Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile

580 585 590 

CGC AAG GCC TTC ATT GCC GAG GAG GGC CAT CTC CTG GTG GCC CTG GAC 1824 

Arg Lys Ala Phe Ile Ala Glu Glu Gly His Leu Leu Val Ala Leu Asp
595 600 605 

TAT AGC CAG ATC GAG CTC CGG GTC CTC GCC CAC CTC TCG GGG GAC GAG 1872 

Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly Asp Glu
610 615 620 

AAC CTC ATC CGG GTC TTC CGG GAA GGG AAG GAC ATC CAC ACC GAG ACC 1920 

Asn Leu Ile Arg Val Phe Arg Glu Gly Lys Asp Ile His Thr Glu Thr
625 630 635 640 

GCC GCC TGG ATG TTC GGC GTG CCC CCC GAG GGG GTG GAC GGG GCC ATG 1968 

Ala Ala Trp Met Phe Gly Val Pro Pro Glu Gly Val Asp Gly Ala Met 

645 650 655 

CGC CGG GCG GCC AAG ACG GTG AAC TTC GGG GTG CTC TAG GGG ATG TCC 2016 

Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu Tyr Gly Met Ser 

660 665 670 

GCC CAC CGC CTC TCC CAG GAG CTC TCC ATC CCC TAG GAG GAG GCG GCG 2064 

Ala His Arg Leu Ser Gln Glu Leu Ser Ile Pro Tyr Glu Glu Ala Ala
675 680 685

GCC TTC ATC GAG CGC TAG TTC GAG AGC TTC CCG AAG GTG GGG GGG TGG 2112

Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg Ala Trp
690 695 700

ATG GGG AAA AGG TTG GAG GAG GGG GGG AAG AAG GGG TAG GTG GAG AGG 2160

Ile Ala Lys Thr Leu Glu Glu Gly Arg Lys Lys Gly Tyr Val Glu Thr
705 710 715 720

GTG TTC GGG GGG GGG CGC TAG GTG GGG GAG CTC AAG GGG GGG GTG AAG 2208

Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val Lys
725 730 735

AGC GTG CGG GAG GCG GGG GAG CGC ATG GCC TTC AAG ATG CGC GTG CAG 2256

Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro Val Gln
740 745 750

GGG AGC GGG GCG GAG GTG ATG AAG GTG GCC ATG GTG AAG CTC TTC CGC 2304

Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu Phe Pro
755 760 765

AGG CTC AGG GCC TTG GGG GTT CGC ATC CTC CTC CAG GTG CAC GAG GAG 2352

Arg Leu Arg Pro Leu Gly Val Arg Ile Leu Leu Gln Val His Asp Glu
770 775 780

GTG GTC TTG GAG GCG CCA AAG GCG CGG GCG GAG GAG GCG GCC CAG TTG 2400

Leu Val Leu Glu Ala Pro Lys Ala Arg Ala Glu Glu Ala Ala Gln Leu
785 790 795 800

GCC AAG GAG AGC ATG GAA GGG GTT TAG CGC CTC TCC GTG GCC GTG GAG 2448

Ala Lys Glu Thr Met Glu Gly Val Tyr Pro Leu Ser Val Pro Leu Glu
805 810 815

GTG GAG GTG GGG ATG GGG GAG GAC TGG CTT TCC GCC AAG GCC 2490

Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala Lys Ala
820 825 830

TAG 2493



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 830 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu Val Asp Gly
1 5 10 15

His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly Leu Thr Thr 

20 25 30 

Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala Lys Ser Leu
35 40 45

Leu Lys Ala Leu Lys Glu Asp Gly Glu Val Ala Ile Val Val Phe Asp
50 55 60 

Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu Ala Tyr Lys Ala 
65 70 75 80 

Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Ile

85 90 95 

Lys Glu Leu Val Asp Leu Leu Gly Leu Val Arg Leu Glu Val Pro Gly 

100 105 110 

Phe Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys Lys Ala Glu Arg 
115 120 125 

Glu Gly Tyr Glu Val Arg Ile Leu Ser Ala Asp Arg Asp Leu Tyr Gln
130 135 140

Leu Leu Ser Asp Arg Ile His Leu Leu His Pro Glu Gly Glu Val Leu
145 150 155 160

Thr Pro Gly Trp Leu Gln Glu Arg Tyr Gly Leu Ser Pro Glu Arg Trp

165 170 175 

Val Glu Tyr Arg Ala Leu Val Gly Asp Pro Ser Asp Asn Leu Pro Gly 

180 185 190 

Val Pro Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp
195 200 205

Gly Ser Leu Glu Ala Ile Leu Lys Asn Leu Asp Gln Val Lys Pro Glu
210 215 220

Arg Val Arg Glu Ala Ile Arg Asn Asn Leu Asp Lys Leu Gln Met Ser
225 230 235 240 

Leu Glu Leu Ser Arg Leu Arg Thr Asp Leu Pro Leu Glu Val Asp Phe 

245 250 255 

Ala Lys Arg Arg Glu Pro Asp Trp Glu Gly Leu Lys Ala Phe Leu Glu 

260 265 270 
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Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu Glu Ala 
275 280 285 

Pro Lys Glu Ala Glu Glu Ala Pro Trp Pro Pro Pro Gly Gly Ala Phe 
290 295 300 

Leu Gly Phe Leu Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu Leu 
305 310 315 320 

Ala Leu Ala Gly Ala Lys Glu Gly Arg Val His Arg Ala Glu Asp Pro 

325 330 335 

Val Gly Ala Leu Lys Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys

340 345 350

Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Arg Glu Ile Pro Pro Gly
355 360 365 

Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Gly Asn Thr Asn 
370 375 380 

Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp Lys Glu Asp Ala 
385 390 395 400 

Ala Ala Arg Ala Leu Leu Ser Glu Arg Leu Trp Gln Ala Leu Tyr Pro

405 410 415 

Arg Val Ala Glu Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu Val Glu 

420 425 430 

Arg Pro Leu Ala Gln Val Leu Ala His Met Glu Ala Thr Gly Val Arg
435 440 445

Leu Asp Val Pro Tyr Leu Glu Ala Leu Ser Gln Glu Val Ala Phe Glu
450 455 460

Leu Glu Arg Leu Glu Ala Glu Val His Arg Leu Ala Gly His Pro Phe
465 470 475 480

Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu Leu

485 490 495

Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg Ser Thr

500 505 510 

Ser Ala Ala Val Leu Glu Leu Leu Arg Glu Ala His Pro Ile Val Gly
515 520 525

Arg Ile Leu Glu Tyr Arg Glu Leu Met Lys Leu Lys Ser Thr Tyr Ile
530 535 540 

Asp Pro Leu Pro Arg Leu Val His Pro Lys Thr Gly Arg Leu His Thr 
545 550 555 560 

Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp

565 570 575

Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile

580 585 590

Arg Lys Ala Phe Ile Ala Glu Glu Gly His Leu Leu Val Ala Leu Asp
595 600 605

Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly Asp Glu
610 615 620

Asn Leu Ile Arg Val Phe Arg Glu Gly Lys Asp Ile His Thr Glu Thr
625 630 635 640 

Ala Ala Trp Met Phe Gly Val Pro Pro Glu Gly Val Asp Gly Ala Met 

645 650 655 

Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu Tyr Gly Met Ser 

660 665 670 

Ala His Arg Leu Ser Gln Glu Leu Ser Ile Pro Tyr Glu Glu Ala Ala
675 680 685

Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg Ala Trp
690 695 700

Ile Ala Lys Thr Leu Glu Glu Gly Arg Lys Lys Gly Tyr Val Glu Thr
705 710 715 720 

Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val Lys 

725 730 735 

Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro Val Gln

740 745 750 

Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu Phe Pro 
755 760 765 

Arg Leu Arg Pro Leu Gly Val Arg Ile Leu Leu Gln Val His Asp Glu
770 775 780

Leu Val Leu Glu Ala Pro Lys Ala Arg Ala Glu Glu Ala Ala Gln Leu
785 790 795 800 

Ala Lys Glu Thr Met Glu Gly Val Tyr Pro Leu Ser Val Pro Leu Glu 

805 810 815 

Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala Lys Ala 

820 825 830 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2505 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
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(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Thermus species Z05 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2502 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATG AAG GCG ATG CTT CCG CTC TTT GAA CCC AAA GGC CGG GTT CTC CTG 48 

Met Lys Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15

GTG GAC GGC CAC CAC CTG GCC TAC CGC ACC TTC TTC GCC CTA AAG GGC 96 

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly 

20 25 30 

CTC ACC ACG AGC CGG GGC GAA CCG GTG CAG GCG GTT TAC GGC TTC GCC 144 

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45 

AAG AGC CTC CTC AAG GCC CTG AAG GAG GAC GGG TAC AAG GCC GTC TTC 192 

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe 
50 55 60 

GTG GTC TTT GAC GCC AAG GCC CCT TCC TTC CGC CAC GAG GCC TAC GAG 240 

Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu 
65 70 75 80 

GCC TAC AAG GCA GGC CGC GCC CCG ACC CCC GAG GAC TTC CCC CGG CAG 288 

Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln

85 90 95 

CTC GCC CTC ATC AAG GAG CTG GTG GAC CTC CTG GGG TTT ACT CGC CTC 336 

Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu

100 105 110

GAG GTT CCG GGC TTT GAG GCG GAC GAC GTC CTC GCC ACC CTG GCC AAG 384

Glu Val Pro Gly Phe Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys
115 120 125

AAG GCG GAA AGG GAG GGG TAC GAG GTG CGC ATC CTC ACC GCC GAC CGG 432

Lys Ala Glu Arg Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg

130 135 140

GAC CTT TAC CAG CTC GTC TCC GAC CGC GTC GCC GTC CTC CAC CCC GAG 480

Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Ala Val Leu His Pro Glu
145 150 155 160

GGC CAC CTC ATC ACC CCG GAG TGG CTT TGG GAG AAG TAC GGC CTT AAG 528

Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Lys

165 170 175 

CCG GAG CAG TGG GTG GAC TTC CGC GCC CTC GTG GGG GAC CCC TCC GAC 576

Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp

180 185 190

AAC CTC CCG GGG GTC AAG GGC ATC GGG GAG AAG ACC GCC CTC AAG CTC 624

Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu
195 200 205 

CTC AAG GAG TGG GGA AGC CTG GAA AAT ATC CTC AAG AAC CTG GAC CGG 672 

Leu Lys Glu Trp Gly Ser Leu Glu Asn Ile Leu Lys Asn Leu Asp Arg
210 215 220 

GTG AAG CCG GAA AGC GTC CGG GAA AGG ATC AAG GCC CAC CTG GAA GAC 720 

Val Lys Pro Glu Ser Val Arg Glu Arg Ile Lys Ala His Leu Glu Asp
225 230 235 240

CTT AAG CTC TCC TTG GAG CTT TCC CGG GTG CGC TCG GAC CTC CCC CTG 768

Leu Lys Leu Ser Leu Glu Leu Ser Arg Val Arg Ser Asp Leu Pro Leu

245 250 255

GAG GTG GAC TTC GCC CGG AGG CGG GAG CCT GAC CGG GAA GGG CTT CGG 816

Glu Val Asp Phe Ala Arg Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg

260 265 270 

GCC TTT TTG GAG CGC TTG GAG TTC GGC AGC CTC CTC CAC GAG TTC GGC 864 

Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly 
275 280 285 
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CTC CTC GAG GCC CCC GCC CCC CTG GAG GAG GCC CCC TGG CCC CCG CCG 912 

Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro 
290 295 300 

GAA GGG GCC TTC GTG GGC TTC GTC CTC TCC CGC CCC GAG CCC ATG TGG 960 

Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp 
305 310 315 320 

GCG GAG CTT AAA GCC CTG GCC GCC TGC AAG GAG GGC CGG GTG CAC CGG 1008 

Ala Glu Leu Lys Ala Leu Ala Ala Cys Lys Glu Gly Arg Val His Arg 

325 330 335 

GCA AAG GAC CCC TTG GCG GGG CTA AAG GAC CTC AAG GAG GTC CGA GGC 1056

Ala Lys Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly 

340 345 350 

CTC CTC GCC AAG GAC CTC GCC GTT TTG GCC CTT CGC GAG GGG CTG GAC 1104 

Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Asp 
355 360 365 

CTC GCG CCT TCG GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCC 1152 

Leu Ala Pro Ser Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro 
370 375 380 

TCC AAC ACC ACC CCC GAG GGG GTG GCC CGG CGC TAC GGG GGG GAG TGG 1200 

Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp 
385 390 395 400 

ACG GAG GAC GCC GCC CAC CGG GCC CTC CTC GCC GAG CGG CTC CAG CAA 1248 

Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ala Glu Arg Leu Gln Gln

405 410 415 

AAC CTC TTG GAA CGC CTC AAG GGA GAG GAA AAG CTC CTT TGG CTC TAC 1296 

Asn Leu Leu Glu Arg Leu Lys Gly Glu Glu Lys Leu Leu Trp Leu Tyr 

420 425 430 

CAA GAG GTG GAA AAG CCC CTC TCC CGG GTC CTG GCC CAC ATG GAG GCC 1344 

Gln Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala
435 440 445 

ACC GGG GTA AGG CTG GAC GTG GCC TAT CTA AAG GCC CTT TCC CTG GAG 1392 

Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Lys Ala Leu Ser Leu Glu 
450 455 460 
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CTT GCG GAG GAG ATT CGC CGC CTC GAG GAG GAG GTC TTC CGC CTG GCG 1440 

Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala
465 470 475 480

GGC CAC CCC TTC AAC CTG AAC TCC CGT GAC CAG CTA GAG CGG GTG CTC 1488

Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu

485 490 495

TTT GAC GAG CTT AGG CTT CCC GCC CTG GGC AAG ACG CAA AAG ACG GGG 1536

Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly

500 505 510 

AAG CGC TCC ACC AGC GCC GCG GTG CTG GAG GCC CTC AGG GAG GCC CAC 1584 

Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 
515 520 525 

CCC ATC GTG GAG AAG ATC CTC CAG CAC CGG GAG CTC ACC AAG CTC AAG 1632

Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys
530 535 540

AAC ACC TAC GTG GAC CCC CTC CCG GGG CTC GTG CAC CCG AGG ACG GGC 1680

Asn Thr Tyr Val Asp Pro Leu Pro Gly Leu Val His Pro Arg Thr Gly
545 550 555 560

CGC CTC CAC ACG CGG TTC AAC CAG ACA GCC ACG GCC ACG GGA AGG CTC 1728

Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu

565 570 575

TCT AGC TCC GAC CCC AAC CTG CAG AAC ATC CCC ATC CGC ACG CCG TTG 1776

Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Ile Arg Thr Pro Leu

580 585 590

GGC CAG AGG ATC CGC CGG GCG TTC GTG GCC GAG GCG GGA TGG GCG TTG 1824

Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu
595 600 605 

GTG GGG CTG GAG TAT AGG CAG ATA GAG CTG GGG GTG GTC GCC CAC GTG 1872 

Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
610 615 620 

TCC GGG GAC GAG AAC CTG ATC AGG GTG TTC CAG GAG GGG AAG GAC ATC 1920

Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile
625 630 635 640

CAC ACC CAG ACC GCA AGC TGG ATG TTC GGC GTC TCC CCG GAG GCC GTG 1968

His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Ser Pro Glu Ala Val

645 650 655

GAC CCC CTG ATG CGC CGG GCG GCC AAG ACG GTG AAC TTC GGC GTC CTC 2016

Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu

660 665 670

TAC GGC ATG TCC GCG CAT AGG CTC TCC CAG GAG CTT GCG ATC CCC TAC 2064

Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr
675 680 685

GAG GAG GCG GTG GCG TTT ATA GAG CGC TAC TTC CAA AGC TTC CCG AAG 2112

Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys
690 695 700

GTG CGG GCG TGG ATA GAA AAG ACC CTG GAG GAG GGG AGG AAG CGG GGC 2160

Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly
705 710 715 720

TAC GTG GAA ACC CTG TTC GGA AGA AGG CGC TAC GTG CCG GAC CTC AAC 2208

Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn 

725 730 735 

GCC CGG GTG AAG AGC GTC AGG GAG GCC GCG GAG CGC ATG GCC TTC AAC 2256 

Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 

740 745 750 

ATG CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTC GCC ATG GTG 2304

Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
755 760 765

AAG CTC TTC CCC CAC CTC CGG GAG ATG GGG GCC CGC ATG CTC CTC CAG 2352

Lys Leu Phe Pro His Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln
770 775 780

GTC CAC GAC GAG CTC CTC CTG GAG GCC CCC CAA GCG CGG GCC GAG GAG 2400

Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu
785 790 795 800 

GTG GCG GCT TTG GCC AAG GAG GCC ATG GAG AAG GCC TAT CCC CTC GCC 2448 

Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala 

805 810 815

GTG CCC CTG GAG GTG GAG GTG GGG ATC GGG GAG GAC TGG CTT TCC GCC 2496

Val Pro Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala

820 825 830 

AAG GGG TGA 2505 

Lys Gly 
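
The header of SEQ ID NO: 7 gives a total length of 2505 base pairs and a CDS spanning positions 1..2502. As a minimal consistency check (a sketch only; the function name cds_codon_count is hypothetical and not part of the listing), the inclusive CDS range can be divided into codons and compared with the 834 residues of the encoded protein, which leaves three bases for the terminal TGA stop codon:

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in an inclusive, 1-based CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# Values copied from the SEQ ID NO: 7 header.
total_bases = 2505
cds_start, cds_end = 1, 2502

codons = cds_codon_count(cds_start, cds_end)  # residues encoded by the CDS
trailing = total_bases - cds_end              # bases left after the CDS

print(codons, trailing)  # prints: 834 3
```

The same check applied to a CDS of 1..2676 gives 892 codons.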



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Lys Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly 

20 25 30 

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45 

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe 
50 55 60 

Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu 
65 70 75 80 

Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln

85 90 95

Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu

100 105 110 

Glu Val Pro Gly Phe Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys
115 120 125

Lys Ala Glu Arg Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg
130 135 140

Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Ala Val Leu His Pro Glu
145 150 155 160 

Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Lys

165 170 175

Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp

180 185 190

Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu
195 200 205

Leu Lys Glu Trp Gly Ser Leu Glu Asn Ile Leu Lys Asn Leu Asp Arg
210 215 220

Val Lys Pro Glu Ser Val Arg Glu Arg Ile Lys Ala His Leu Glu Asp
225 230 235 240 

Leu Lys Leu Ser Leu Glu Leu Ser Arg Val Arg Ser Asp Leu Pro Leu 

245 250 255 

Glu Val Asp Phe Ala Arg Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg 

260 265 270 

Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly 
275 280 285 

Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro 
290 295 300 

Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp 
305 310 315 320 

Ala Glu Leu Lys Ala Leu Ala Ala Cys Lys Glu Gly Arg Val His Arg 

325 330 335 

Ala Lys Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly 

340 345 350 

Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Asp 
355 360 365 

Leu Ala Pro Ser Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro 
370 375 380 

Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp 
385 390 395 400 

Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ala Glu Arg Leu Gln Gln

405 410 415

Asn Leu Leu Glu Arg Leu Lys Gly Glu Glu Lys Leu Leu Trp Leu Tyr

420 425 430

Gln Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala
435 440 445

Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Lys Ala Leu Ser Leu Glu
450 455 460

Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala
465 470 475 480

Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu

485 490 495

Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly

500 505 510 

Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 
515 520 525 

Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys
530 535 540

Asn Thr Tyr Val Asp Pro Leu Pro Gly Leu Val His Pro Arg Thr Gly
545 550 555 560

Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu

565 570 575 

Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Ile Arg Thr Pro Leu

580 585 590

Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu
595 600 605

Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
610 615 620

Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile
625 630 635 640

His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Ser Pro Glu Ala Val

645 650 655 

Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu 

660 665 670 

Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr
675 680 685

Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys
690 695 700

Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly
705 710 715 720 

Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn 

725 730 735 

Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 

740 745 750 

Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
755 760 765

Lys Leu Phe Pro His Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln
770 775 780

Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu
785 790 795 800 

Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala 

805 810 815 

Val Pro Leu Glu Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala

820 825 830 

Lys Gly 
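
For comparing the three-letter listing with one-letter protein records, a small conversion helper can be useful. The following Python sketch is illustrative only (the names THREE_TO_ONE and to_one_letter are hypothetical and not part of the listing). It uses the standard three-letter to one-letter amino acid codes and raises a KeyError on any token that is not a standard residue code, which also helps catch transcription errors:

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError for any token that is not a standard residue code.
    """
    return "".join(THREE_TO_ONE[token] for token in residues.split())


# First line of SEQ ID NO: 8 as given in the listing.
first_line = "Met Lys Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu"
print(to_one_letter(first_line))  # prints: MKAMLPLFEPKGRVLL
```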



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2505 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Thermus thermophilus 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2502 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATG GAG GCG ATG CTT CCG CTC TTT GAA CCC AAA GGC CGG GTC CTC CTG 48 

Met Glu Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15

GTG GAC GGC CAC CAC CTG GCC TAC CGC ACC TTC TTC GCC CTG AAG GGC 96

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly

20 25 30

CTC ACC ACG AGC CGG GGC GAA CCG GTG CAG GCG GTC TAC GGC TTC GCC 144

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45

AAG AGC CTC CTC AAG GCC CTG AAG GAG GAC GGG TAC AAG GCC GTC TTC 192

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe
50 55 60

GTG GTC TTT GAC GCC AAG GCC CCC TCC TTC CGC CAC GAG GCC TAC GAG 240

Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu
65 70 75 80

GCC TAC AAG GCG GGG AGG GCC CCG ACC CCC GAG GAC TTC CCC CGG CAG 288

Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln

85 90 95 

CTC GCC CTC ATC AAG GAG CTG GTG GAC CTC CTG GGG TTT ACC CGC CTC 336

Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu

100 105 110

GAG GTC CCC GGG TAC GAG GCG GAC GAC GTT CTC GCC ACC CTG GCC AAG 384

Glu Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys
115 120 125

AAG GCG GAA AAG GAG GGG TAC GAG GTG CGC ATC CTC ACC GCC GAC CGC 432

Lys Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg
130 135 140

GAC CTC TAC CAA CTC GTC TCC GAC CGC GTC GCC GTC CTC CAC CCC GAG 480

Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Ala Val Leu His Pro Glu
145 150 155 160 

GGC CAC CTC ATC ACC CCG GAG TGG CTT TGG GAG AAG TAC GGC CTC AGG 528 

Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Arg

165 170 175

CCG GAG CAG TGG GTG GAC TTC CGC GCC CTC GTG GGG GAC CCC TCC GAC 576

Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp

180 185 190

AAC CTC CCC GGG GTC AAG GGC ATC GGG GAG AAG ACC GCC CTC AAG CTC 624

Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu
195 200 205 

CTC AAG GAG TGG GGA AGC CTG GAA AAC CTC CTC AAG AAC CTG GAC CGG 672

Leu Lys Glu Trp Gly Ser Leu Glu Asn Leu Leu Lys Asn Leu Asp Arg
210 215 220

GTA AAG CCA GAA AAC GTC CGG GAG AAG ATC AAG GCC CAC CTG GAA GAC 720

Val Lys Pro Glu Asn Val Arg Glu Lys Ile Lys Ala His Leu Glu Asp
225 230 235 240

CTC AGG CTC TCC TTG GAG CTC TCC CGG GTG CGC ACC GAC CTC CCC CTG 768

Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu

245 250 255

GAG GTG GAC CTC GCC CAG GGG CGG GAG CCC GAC CGG GAG GGG CTT AGG 816

Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu Gly Leu Arg

260 265 270

GCC TTC CTG GAG AGG CTG GAG TTC GGC AGC CTC CTC CAC GAG TTC GGC 864

Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly
275 280 285

CTC CTG GAG GCC CCC GCC CCC CTG GAG GAG GCC CCC TGG CCC CCG CCG 912

Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro 
290 295 300 

GAA GGG GCC TTC GTG GGC TTC GTC CTC TCC CGC CCC GAG CCC ATG TGG 960 

Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp
305 310 315 320 

GCG GAG CTT AAA GCC CTG GCC GCC TGC AGG GAC GGC CGG GTG CAC CGG 1008 

Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg 

325 330 335 

GCA GCA GAC CCC TTG GCG GGG CTA AAG GAC CTC AAG GAG GTC CGG GGC 1056

Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly 

340 345 350 

CTC CTC GCC AAG GAC CTC GCC GTC TTG GCC TCG AGG GAG GGG CTA GAC 1104 

Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp 
355 360 365 

CTC GTG CCC GGG GAC GAC CCC ATG CTC CTC GCC TAC CTC CTG GAC CCC 1152

Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro
370 375 380

TCC AAC ACC ACC CCC GAG GGG GTG GCG CGG CGC TAC GGG GGG GAG TGG 1200

Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
385 390 395 400

ACG GAG GAC GCC GCC CAC CGG GCC CTC CTC TCG GAG AGG CTC CAT CGG 1248

Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg

405 410 415

AAC CTC CTT AAG CGC CTC GAG GGG GAG GAG AAG CTC CTT TGG CTC TAC 1296

Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr

420 425 430

CAC GAG GTG GAA AAG CCC CTC TCC CGG GTC CTG GCC CAC ATG GAG GCC 1344

His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala
435 440 445

ACC GGG GTA CGG CTG GAC GTG GCC TAC CTT CAG GCC CTT TCC CTG GAG 1392

Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu
450 455 460 

CTT GCG GAG GAG ATC CGC CGC CTC GAG GAG GAG GTC TTC CGC TTG GCG 1440 

Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala
465 470 475 480

GGC CAC CCC TTC AAC CTC AAC TCC CGG GAC CAG CTG GAA AGG GTG CTC 1488

Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu

485 490 495

TTT GAC GAG CTT AGG CTT CCC GCC TTG GGG AAG ACG CAA AAG ACA GGC 1536

Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly

500 505 510 

AAG CGC TCC ACC AGC GCC GCG GTG CTG GAG GCC CTA CGG GAG GCC CAC 1584 

Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 
515 520 525 

CCC ATC GTG GAG AAG ATC CTC CAG CAC CGG GAG CTC ACC AAG CTC AAG 1632 

Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys
530 535 540

AAC ACC TAC GTG GAC CCC CTC CCA AGC CTC GTC CAC CCG AGG ACG GGC 1680

Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly
545 550 555 560

CGC CTC CAC ACC CGC TTC AAC CAG ACG GCC ACG GCC ACG GGG AGG CTT 1728

Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu

565 570 575 

ACT AGC TCC GAC CCC AAC CTG CAG AAC ATC CCC GTC CGC ACC CCC TTG 1776 

Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu

580 585 590

GGC CAG AGG ATC CGC CGG GCC TTC GTG GCC GAG GCG GGT TGG GCG TTG 1824

Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu
595 600 605

GTG GCC CTG GAC TAT AGC CAG ATA GAG CTC CGC GTC CTC GCC CAC CTC 1872

Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
610 615 620

TCC GGG GAC GAA AAC CTG ATC AGG GTC TTC CAG GAG GGG AAG GAC ATC 1920

Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile
625 630 635 640

CAC ACC CAG ACC GCA AGC TGG ATG TTC GGC GTC CCC CCG GAG GCC GTG 1968

His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val

645 650 655 

GAC CCC CTG ATG CGC CGG GCG GCC AAG ACG GTG AAC TTC GGC GTC CTC 2016 

Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu 

660 665 670 

TAC GGC ATG TCC GCC CAT AGG CTC TCC CAG GAG CTT GCC ATC CCC TAC 2064

Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr
675 680 685

GAG GAG GCG GTG GCC TTT ATA GAG CGC TAC TTC CAA AGC TTC CCC AAG 2112

Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys
690 695 700

GTG CGG GCC TGG ATA GAA AAG ACC CTG GAG GAG GGG AGG AAG CGG GGC 2160

Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly
705 710 715 720

TAC GTG GAA ACC CTC TTC GGA AGA AGG CGC TAC GTG CCC GAC CTC AAC 2208

Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn

725 730 735 

GCC CGG GTG AAG AGC GTC AGG GAG GCC GCG GAG CGC ATG GCC TTC AAC 2256 

Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 

740 745 750 

ATG CCC GTC CAG GGC ACC GCC GCC GAC CTC ATG AAG CTC GCC ATG GTG 2304

Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
755 760 765

AAG CTC TTC CCC CGC CTC CGG GAG ATG GGG GCC CGC ATG CTC CTC CAG 2352

Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln
770 775 780

GTC CAC GAC GAG CTC CTC CTG GAG GCC CCC CAA GCG CGG GCC GAG GAG 2400

Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu
785 790 795 800 

GTG GCG GCT TTG GCC AAG GAG GCC ATG GAG AAG GCC TAT CCC CTC GCC 2448 

Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala 

805 810 815 

GTG CCC CTG GAG GTG GAG GTG GGG ATG GGG GAG GAC TGG CTT TCC GCC 2496

Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala 

820 825 830 

AAG GGT TAG 2505 
Lys Gly 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Glu Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu 
1 5 10 15

Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe Ala Leu Lys Gly 

20 25 30 

Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala
35 40 45 

Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly Tyr Lys Ala Val Phe 
50 55 60 

Val Val Phe Asp Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Glu 
65 70 75 80 

Ala Tyr Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln

85 90 95

Leu Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Phe Thr Arg Leu

100 105 110

Glu Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys
115 120 125

Lys Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg
130 135 140

Asp Leu Tyr Gln Leu Val Ser Asp Arg Val Ala Val Leu His Pro Glu
145 150 155 160

Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Arg

165 170 175 

Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro Ser Asp

180 185 190

Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu
195 200 205

Leu Lys Glu Trp Gly Ser Leu Glu Asn Leu Leu Lys Asn Leu Asp Arg
210 215 220

Val Lys Pro Glu Asn Val Arg Glu Lys Ile Lys Ala His Leu Glu Asp
225 230 235 240

Leu Arg Leu Ser Leu Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu

245 250 255

Glu Val Asp Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu Gly Leu Arg

260 265 270 

Ala Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly 
275 280 285 

Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro 
290 295 300 

Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro Met Trp 
305 310 315 320 

Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp Gly Arg Val His Arg 

325 330 335

Ala Ala Asp Pro Leu Ala Gly Leu Lys Asp Leu Lys Glu Val Arg Gly 

340 345 350 

Leu Leu Ala Lys Asp Leu Ala Val Leu Ala Ser Arg Glu Gly Leu Asp 
355 360 365 

Leu Val Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro 
370 375 380 

Ser Asn Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp 

385 390 395 400
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Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg 

405 410 415 

Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp Leu Tyr 

420 425 430 

His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala 
435 440 445 

Thr Gly Val Arg Leu Asp Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu
450 455 460

Leu Ala Glu Glu Ile Arg Arg Leu Glu Glu Glu Val Phe Arg Leu Ala
465 470 475 480

Gly His Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu

485 490 495

Phe Asp Glu Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly

500 505 510 

Lys Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 
515 520 525 

Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys
530 535 540

Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg Thr Gly
545 550 555 560

Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu

565 570 575 

Ser Ser Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu

580 585 590

Gly Gln Arg Ile Arg Arg Ala Phe Val Ala Glu Ala Gly Trp Ala Leu
595 600 605

Val Ala Leu Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
610 615 620

Ser Gly Asp Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile
625 630 635 640

His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val

645 650 655 

Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly Val Leu 

660 665 670 

Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu Ala Ile Pro Tyr
675 680 685

Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr Phe Gln Ser Phe Pro Lys
690 695 700

Val Arg Ala Trp Ile Glu Lys Thr Leu Glu Glu Gly Arg Lys Arg Gly
705 710 715 720 

Tyr Val Glu Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn 

725 730 735 

Ala Arg Val Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 

740 745 750

Met Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
755 760 765

Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln
770 775 780

Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala Glu Glu

785 790 795 800

Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys Ala Tyr Pro Leu Ala 

805 810 815 

Val Pro Leu Glu Val Glu Val Gly Met Gly Glu Asp Trp Leu Ser Ala 

820 825 830 

Lys Gly 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2679 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Thermosipho africanus

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2676 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
ATG GGA AAG ATG TTT CTA TTT GAT GGA ACT GGA TTA GTA TAC AGA GCA 48

Met Gly Lys Met Phe Leu Phe Asp Gly Thr Gly Leu Val Tyr Arg Ala 
1 5 10 15
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TTT TAT GCT ATA GAT CAA TCT CTT CAA ACT TCG TCT GGT TTA GAG AGT 96 

Phe Tyr Ala Ile Asp Gln Ser Leu Gln Thr Ser Ser Gly Leu His Thr

20 25 30 

AAT GGT GTA TAG GGA GTT ACT AAA ATG CTT ATA AAA TTT TTA AAA GAA 144 

Asn Ala Val Tyr Gly Leu Thr Lys Met Leu Ile Lys Phe Leu Lys Glu
35 40 45 

CAT ATG AGT ATT GGA AAA GAT GGT TGT GTT TTT GTT TTA GAT TCA AAA 192 

His Ile Ser Ile Gly Lys Asp Ala Cys Val Phe Val Leu Asp Ser Lys
50 55 60 

GGT GGT AGG AAA AAA AGA AAG GAT ATT CTT GAA ACA TAT AAA GGA AAT 240 

Gly Gly Ser Lys Lys Arg Lys Asp lie Leu Glu Thr Tyr Lys Ala Asn 
65 70 75 80 

AGG GGA TCA AGG CCT GAT TTA CTT TTA GAG CAA ATT CCA TAT GTA GAA 288 

Arg Pro Ser Thr Pro Asp Leu Leu Leu Glu Gin lie Pro Tyr Val Glu 

85 90 95 

GAA GTT GTT GAT GCT CTT GGA ATA AAA GTT TTA AAA ATA GAA GGC TTT 336 

Glu Leu Val Asp Ala Leu Gly lie Lys Val Leu Lys lie Glu Gly Phe 

100 105 110 

GAA GCT GAT GAG ATT ATT GCT ACG CTT TCT AAA AAA TTT GAA AGT GAT 384 

Glu Ala Asp Asp lie lie Ala Thr Leu Ser Lys Lys Phe Glu Ser Asp 
115 120 125 

TTT GAA AAG GTA AAC ATA ATA ACT GGA GAT AAA GAT CTT TTA CAA CTT 432 

Phe Glu Lys Val Asn lie lie Thr Gly Asp Lys Asp Leu Leu Cln Leu 
130 135 140 

GTT TCT GAT AAG GTT TTT GTT TGG AGA GTA GAA AGA GGA ATA ACA GAT 480 

Val Ser Asp Lys Val Phe Val Trp Arg Val Glu Arg Gly lie Thr Asp 
145 150 155 160 

TTG GTA TTG TAG GAT AGA AAT AAA GTG ATT GAA AAA TAT GGA ATC TAC 528 

Leu Val Leu Tyr Asp Arg Asn Lys Val He Glu Lys Tyr Gly He Tyr 

165 170 175 

CCA GAA CAA TTG AAA GAT TAT TTA TCT CTT GTC GGT GAT CAG ATT GAT 576 

Pro Glu Gin Phe Lys Asp Tyr Leu Ser Leu Val Gly Asp Gin He Asp 

180 185 190

AAT ATC CCA GGA GTT AAA GGA ATA GGA AAG AAA ACA GCT GTT TCG CTT 624
Asn Ile Pro Gly Val Lys Gly Ile Gly Lys Lys Thr Ala Val Ser Leu
195 200 205

TTG AAA AAA TAT AAT AGC TTG GAA AAT GTA TTA AAA AAT ATT AAC CTT 672
Leu Lys Lys Tyr Asn Ser Leu Glu Asn Val Leu Lys Asn Ile Asn Leu
210 215 220

TTG ACG GAA AAA TTA AGA AGG CTT TTG GAA GAT TCA AAG GAA GAT TTG 720
Leu Thr Glu Lys Leu Arg Arg Leu Leu Glu Asp Ser Lys Glu Asp Leu
225 230 235 240

CAA AAA AGT ATA GAA CTT GTG GAG TTG ATA TAT GAT GTA CCA ATG GAT 768
Gln Lys Ser Ile Glu Leu Val Glu Leu Ile Tyr Asp Val Pro Met Asp
245 250 255

GTG GAA AAA GAT GAA ATA ATT TAT AGA GGG TAT AAT CCA GAT AAG CTT 816
Val Glu Lys Asp Glu Ile Ile Tyr Arg Gly Tyr Asn Pro Asp Lys Leu
260 265 270

TTA AAG GTA TTA AAA AAG TAC GAA TTT TCA TCT ATA ATT AAG GAG TTA 864
Leu Lys Val Leu Lys Lys Tyr Glu Phe Ser Ser Ile Ile Lys Glu Leu
275 280 285

AAT TTA CAA GAA AAA TTA GAA AAG GAA TAT ATA CTG GTA GAT AAT GAA 912
Asn Leu Gln Glu Lys Leu Glu Lys Glu Tyr Ile Leu Val Asp Asn Glu
290 295 300

GAT AAA TTG AAA AAA CTT GCA GAA GAG ATA GAA AAA TAC AAA ACT TTT 960
Asp Lys Leu Lys Lys Leu Ala Glu Glu Ile Glu Lys Tyr Lys Thr Phe
305 310 315 320

TCA ATT GAT ACG GAA ACA ACT TCA CTT GAT CCA TTT GAA GCT AAA CTG 1008
Ser Ile Asp Thr Glu Thr Thr Ser Leu Asp Pro Phe Glu Ala Lys Leu
325 330 335

GTT GGG ATC TCT ATT TCC ACA ATG GAA GGG AAG GCG TAT TAT ATT CCG 1056
Val Gly Ile Ser Ile Ser Thr Met Glu Gly Lys Ala Tyr Tyr Ile Pro
340 345 350

GTG TCT CAT TTT GGA GCT AAG AAT ATT TCC AAA AGT TTA ATA GAT AAA 1104
Val Ser His Phe Gly Ala Lys Asn Ile Ser Lys Ser Leu Ile Asp Lys
355 360 365

TTT CTA AAA CAA ATT TTG CAA GAG AAG GAT TAT AAT ATC GTT GGT CAG 1152
Phe Leu Lys Gln Ile Leu Gln Glu Lys Asp Tyr Asn Ile Val Gly Gln
370 375 380

AAT TTA AAA TTT GAC TAT GAG ATT TTT AAA AGC ATG GGT TTT TCT CCA 1200
Asn Leu Lys Phe Asp Tyr Glu Ile Phe Lys Ser Met Gly Phe Ser Pro
385 390 395 400

AAT GTT CCG CAT TTT GAT ACG ATG ATT GCA GCC TAT CTT TTA AAT CCA 1248
Asn Val Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Asn Pro
405 410 415

GAT GAA AAA CGT TTT AAT CTT GAA GAG CTA TCC TTA AAA TAT TTA GGT 1296
Asp Glu Lys Arg Phe Asn Leu Glu Glu Leu Ser Leu Lys Tyr Leu Gly
420 425 430

TAT AAA ATG ATC TCG TTT GAT GAA TTA GTA AAT GAA AAT GTA CCA TTG 1344
Tyr Lys Met Ile Ser Phe Asp Glu Leu Val Asn Glu Asn Val Pro Leu
435 440 445

TTT GGA AAT GAC TTT TCG TAT GTT CCA CTA GAA AGA GCC GTT GAG TAT 1392
Phe Gly Asn Asp Phe Ser Tyr Val Pro Leu Glu Arg Ala Val Glu Tyr
450 455 460

TCC TGT GAA GAT GCC GAT GTG ACA TAC AGA ATA TTT AGA AAG C*T GGT 1440
Ser Cys Glu Asp Ala Asp Val Thr Tyr Arg Ile Phe Arg Lys Leu Gly
465 470 475 480

AGG AAG ATA TAT GAA AAT GAG ATG GAA AAG TTG TTT TAC GAA ATT GAG 1488
Arg Lys Ile Tyr Glu Asn Glu Met Glu Lys Leu Phe Tyr Glu Ile Glu
485 490 495

ATG CCC TTA ATT GAT GTT CTT TCA GAA ATG GAA CTA AAT GGA GTG TAT 1536
Met Pro Leu Ile Asp Val Leu Ser Glu Met Glu Leu Asn Gly Val Tyr
500 505 510

TTT GAT GAG GAA TAT TTA AAA GAA TTA TCA AAA AAA TAT CAA GAA AAA 1584
Phe Asp Glu Glu Tyr Leu Lys Glu Leu Ser Lys Lys Tyr Gln Glu Lys
515 520 525

ATG GAT GGA ATT AAG GAA AAA GTT TTT GAG ATA GCT GGT GAA ACT TTC 1632
Met Asp Gly Ile Lys Glu Lys Val Phe Glu Ile Ala Gly Glu Thr Phe
530 535 540

AAT TTA AAC TCT TCA ACT CAA GTA GCA TAT ATA CTA TTT GAA AAA TTA 1680
Asn Leu Asn Ser Ser Thr Gln Val Ala Tyr Ile Leu Phe Glu Lys Leu
545 550 555 560

AAT ATT GCT CCT TAC AAA AAA ACA GCG ACT GGT AAG TTT TCA ACT AAT 1728
Asn Ile Ala Pro Tyr Lys Lys Thr Ala Thr Gly Lys Phe Ser Thr Asn
565 570 575

GCG GAA GTT TTA GAA GAA CTT TCA AAA GAA CAT GAA ATT GCA AAA TTG 1776
Ala Glu Val Leu Glu Glu Leu Ser Lys Glu His Glu Ile Ala Lys Leu
580 585 590

TTG CTG GAG TAT CGA AAG TAT CAA AAA TTA AAA AGT ACA TAT ATT GAT 1824
Leu Leu Glu Tyr Arg Lys Tyr Gln Lys Leu Lys Ser Thr Tyr Ile Asp
595 600 605

TCA ATA CCG TTA TCT ATT AAT CGA AAA ACA AAC AGG GTC CAT ACT ACT 1872
Ser Ile Pro Leu Ser Ile Asn Arg Lys Thr Asn Arg Val His Thr Thr
610 615 620

TTT CAT CAA ACA GGA ACT TCT ACT GGA AGA TTA AGT AGT TCA AAT CCA 1920
Phe His Gln Thr Gly Thr Ser Thr Gly Arg Leu Ser Ser Ser Asn Pro
625 630 635 640

AAT TTG CAA AAT CTT CCA ACA AGA AGC GAA GAA GGA AAA GAA ATA AGA 1968
Asn Leu Gln Asn Leu Pro Thr Arg Ser Glu Glu Gly Lys Glu Ile Arg
645 650 655

AAA GCA GTA AGA CCT CAA AGA CAA GAT TGG TGG ATT TTA GGT GCT GAC 2016
Lys Ala Val Arg Pro Gln Arg Gln Asp Trp Trp Ile Leu Gly Ala Asp
660 665 670

TAT TCT CAG ATA GAA CTA AGG GTT TTA GCG CAT GTA AGT AAA GAT GAA 2064
Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Val Ser Lys Asp Glu
675 680 685

AAT CTA CTT AAA GCA TTT AAA GAA GAT TTA GAT ATT CAT ACA ATT ACT 2112
Asn Leu Leu Lys Ala Phe Lys Glu Asp Leu Asp Ile His Thr Ile Thr
690 695 700

GCT GCC AAA ATT TTT GGT GTT TCA GAG ATG TTT GTT AGT GAA CAA ATG 2160
Ala Ala Lys Ile Phe Gly Val Ser Glu Met Phe Val Ser Glu Gln Met
705 710 715 720

AGA AGA GTT GGA AAG ATG GTA AAT TTT GCA ATT ATT TAT GGA GTT TCA 2208
Arg Arg Val Gly Lys Met Val Asn Phe Ala Ile Ile Tyr Gly Val Ser
725 730 735

CCT TAT GGT CTT TCA AAG AGA ATT GGT CTT AGT GTT TCA GAG ACT AAA 2256
Pro Tyr Gly Leu Ser Lys Arg Ile Gly Leu Ser Val Ser Glu Thr Lys
740 745 750

AAA ATA ATA GAT AAC TAT TTT AGA TAC TAT AAA GGA GTT TTT GAA TAT 2304
Lys Ile Ile Asp Asn Tyr Phe Arg Tyr Tyr Lys Gly Val Phe Glu Tyr
755 760 765

TTA AAA AGG ATG AAA GAT GAA GCA AGG AAA AAA GGT TAT GTT ACA ACG 2352
Leu Lys Arg Met Lys Asp Glu Ala Arg Lys Lys Gly Tyr Val Thr Thr
770 775 780

CTT TTT GGA AGG CGC AGA TAT ATT CCA CAG TTA AGA TCG AAA AAT GGT 2400
Leu Phe Gly Arg Arg Arg Tyr Ile Pro Gln Leu Arg Ser Lys Asn Gly
785 790 795 800

AAT AGA GTT CAA GAA GGA GAA AGA ATA GCT GTA AAC ACT CCA ATT CAA 2448
Asn Arg Val Gln Glu Gly Glu Arg Ile Ala Val Asn Thr Pro Ile Gln
805 810 815

GGA ACA GCA GCT GAT ATA ATA AAG ATA GCT ATG ATT AAT ATT CAT AAT 2496
Gly Thr Ala Ala Asp Ile Ile Lys Ile Ala Met Ile Asn Ile His Asn
820 825 830

AGA TTG AAG AAG GAA AAT CTA CGT TCA AAA ATG ATA TTG CAG GTT CAT 2544
Arg Leu Lys Lys Glu Asn Leu Arg Ser Lys Met Ile Leu Gln Val His
835 840 845

GAC GAG TTA GTT TTT GAA GTG CCC GAT AAT GAA CTG GAG ATT GTA AAA 2592
Asp Glu Leu Val Phe Glu Val Pro Asp Asn Glu Leu Glu Ile Val Lys
850 855 860

GAT TTA GTA AGA GAT GAG ATG GAA AAT GCA GTT AAG CTA GAC GTT CCT 2640
Asp Leu Val Arg Asp Glu Met Glu Asn Ala Val Lys Leu Asp Val Pro
865 870 875 880

TTA AAA GTA GAT GTT TAT TAT GGA AAA GAG TGG GAA TAA 2679
Leu Lys Val Asp Val Tyr Tyr Gly Lys Glu Trp Glu
885 890

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 892 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gly Lys Met Phe Leu Phe Asp Gly Thr Gly Leu Val Tyr Arg Ala
1 5 10 15

Phe Tyr Ala Ile Asp Gln Ser Leu Gln Thr Ser Ser Gly Leu His Thr
20 25 30

Asn Ala Val Tyr Gly Leu Thr Lys Met Leu Ile Lys Phe Leu Lys Glu
35 40 45

His Ile Ser Ile Gly Lys Asp Ala Cys Val Phe Val Leu Asp Ser Lys
50 55 60

Gly Gly Ser Lys Lys Arg Lys Asp Ile Leu Glu Thr Tyr Lys Ala Asn
65 70 75 80

Arg Pro Ser Thr Pro Asp Leu Leu Leu Glu Gln Ile Pro Tyr Val Glu
85 90 95

Glu Leu Val Asp Ala Leu Gly Ile Lys Val Leu Lys Ile Glu Gly Phe
100 105 110

Glu Ala Asp Asp Ile Ile Ala Thr Leu Ser Lys Lys Phe Glu Ser Asp
115 120 125

Phe Glu Lys Val Asn Ile Ile Thr Gly Asp Lys Asp Leu Leu Gln Leu
130 135 140

Val Ser Asp Lys Val Phe Val Trp Arg Val Glu Arg Gly Ile Thr Asp
145 150 155 160

Leu Val Leu Tyr Asp Arg Asn Lys Val Ile Glu Lys Tyr Gly Ile Tyr
165 170 175

Pro Glu Gln Phe Lys Asp Tyr Leu Ser Leu Val Gly Asp Gln Ile Asp
180 185 190

Asn Ile Pro Gly Val Lys Gly Ile Gly Lys Lys Thr Ala Val Ser Leu
195 200 205

Leu Lys Lys Tyr Asn Ser Leu Glu Asn Val Leu Lys Asn Ile Asn Leu
210 215 220

Leu Thr Glu Lys Leu Arg Arg Leu Leu Glu Asp Ser Lys Glu Asp Leu
225 230 235 240

Gln Lys Ser Ile Glu Leu Val Glu Leu Ile Tyr Asp Val Pro Met Asp
245 250 255

Val Glu Lys Asp Glu Ile Ile Tyr Arg Gly Tyr Asn Pro Asp Lys Leu
260 265 270

Leu Lys Val Leu Lys Lys Tyr Glu Phe Ser Ser Ile Ile Lys Glu Leu
275 280 285

Asn Leu Gln Glu Lys Leu Glu Lys Glu Tyr Ile Leu Val Asp Asn Glu
290 295 300

Asp Lys Leu Lys Lys Leu Ala Glu Glu Ile Glu Lys Tyr Lys Thr Phe
305 310 315 320

Ser Ile Asp Thr Glu Thr Thr Ser Leu Asp Pro Phe Glu Ala Lys Leu
325 330 335

Val Gly Ile Ser Ile Ser Thr Met Glu Gly Lys Ala Tyr Tyr Ile Pro
340 345 350

Val Ser His Phe Gly Ala Lys Asn Ile Ser Lys Ser Leu Ile Asp Lys
355 360 365

Phe Leu Lys Gln Ile Leu Gln Glu Lys Asp Tyr Asn Ile Val Gly Gln
370 375 380

Asn Leu Lys Phe Asp Tyr Glu Ile Phe Lys Ser Met Gly Phe Ser Pro
385 390 395 400

Asn Val Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Asn Pro
405 410 415

Asp Glu Lys Arg Phe Asn Leu Glu Glu Leu Ser Leu Lys Tyr Leu Gly
420 425 430

Tyr Lys Met Ile Ser Phe Asp Glu Leu Val Asn Glu Asn Val Pro Leu
435 440 445

Phe Gly Asn Asp Phe Ser Tyr Val Pro Leu Glu Arg Ala Val Glu Tyr
450 455 460

Ser Cys Glu Asp Ala Asp Val Thr Tyr Arg Ile Phe Arg Lys Leu Gly
465 470 475 480

Arg Lys Ile Tyr Glu Asn Glu Met Glu Lys Leu Phe Tyr Glu Ile Glu
485 490 495

Met Pro Leu Ile Asp Val Leu Ser Glu Met Glu Leu Asn Gly Val Tyr
500 505 510

Phe Asp Glu Glu Tyr Leu Lys Glu Leu Ser Lys Lys Tyr Gln Glu Lys
515 520 525

Met Asp Gly Ile Lys Glu Lys Val Phe Glu Ile Ala Gly Glu Thr Phe
530 535 540

Asn Leu Asn Ser Ser Thr Gln Val Ala Tyr Ile Leu Phe Glu Lys Leu
545 550 555 560

Asn Ile Ala Pro Tyr Lys Lys Thr Ala Thr Gly Lys Phe Ser Thr Asn
565 570 575

Ala Glu Val Leu Glu Glu Leu Ser Lys Glu His Glu Ile Ala Lys Leu
580 585 590

Leu Leu Glu Tyr Arg Lys Tyr Gln Lys Leu Lys Ser Thr Tyr Ile Asp
595 600 605

Ser Ile Pro Leu Ser Ile Asn Arg Lys Thr Asn Arg Val His Thr Thr
610 615 620

Phe His Gln Thr Gly Thr Ser Thr Gly Arg Leu Ser Ser Ser Asn Pro
625 630 635 640

Asn Leu Gln Asn Leu Pro Thr Arg Ser Glu Glu Gly Lys Glu Ile Arg
645 650 655

Lys Ala Val Arg Pro Gln Arg Gln Asp Trp Trp Ile Leu Gly Ala Asp
660 665 670

Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Val Ser Lys Asp Glu
675 680 685

Asn Leu Leu Lys Ala Phe Lys Glu Asp Leu Asp Ile His Thr Ile Thr
690 695 700

Ala Ala Lys Ile Phe Gly Val Ser Glu Met Phe Val Ser Glu Gln Met
705 710 715 720

Arg Arg Val Gly Lys Met Val Asn Phe Ala Ile Ile Tyr Gly Val Ser
725 730 735

Pro Tyr Gly Leu Ser Lys Arg Ile Gly Leu Ser Val Ser Glu Thr Lys
740 745 750

Lys Ile Ile Asp Asn Tyr Phe Arg Tyr Tyr Lys Gly Val Phe Glu Tyr
755 760 765

Leu Lys Arg Met Lys Asp Glu Ala Arg Lys Lys Gly Tyr Val Thr Thr
770 775 780

Leu Phe Gly Arg Arg Arg Tyr Ile Pro Gln Leu Arg Ser Lys Asn Gly
785 790 795 800

Asn Arg Val Gln Glu Gly Glu Arg Ile Ala Val Asn Thr Pro Ile Gln
805 810 815

Gly Thr Ala Ala Asp Ile Ile Lys Ile Ala Met Ile Asn Ile His Asn
820 825 830

Arg Leu Lys Lys Glu Asn Leu Arg Ser Lys Met Ile Leu Gln Val His
835 840 845

Asp Glu Leu Val Phe Glu Val Pro Asp Asn Glu Leu Glu Ile Val Lys
850 855 860

Asp Leu Val Arg Asp Glu Met Glu Asn Ala Val Lys Leu Asp Val Pro
865 870 875 880

Leu Lys Val Asp Val Tyr Tyr Gly Lys Glu Trp Glu
885 890
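
The amino acid sequence of SEQ ID NO: 12 is the translation given beneath the coding sequence of SEQ ID NO: 11, so the two listings can be checked against each other. The following is a minimal Python sketch of such a check, assuming the standard genetic code; the helper names `translate` and `to_three_letter` are illustrative and are not part of the listing. It translates the first sixteen codons of SEQ ID NO: 11.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) up to the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon ends the coding sequence
            break
        protein.append(residue)
    return "".join(protein)


def to_three_letter(protein: str) -> str:
    """Render a one-letter protein string in three-letter codes."""
    return " ".join(THREE_LETTER[r] for r in protein)


# First row (nucleotides 1-48) of SEQ ID NO: 11.
FIRST_ROW = "ATG GGA AAG ATG TTT CTA TTT GAT GGA ACT GGA TTA GTA TAC AGA GCA"
print(to_three_letter(translate(FIRST_ROW)))
```

Applied row by row, a check of this kind flags codons whose translation disagrees with the residue printed beneath them.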



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA probe BW33 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GATCGCTGCG CGTAACCACC ACACCCGCCG CGC 33 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer BW37 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GCGCTAGGGC GCTGGCAAGT GTAGCGGTCA 30 



WO 92/06200



PCT/US91/07035 






(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: YES 

(iv) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..4 

(D) OTHER INFORMATION: /label= Xaa
/note= "Xaa = Val or Thr"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 

Ala Xaa Tyr Gly 
1 



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

His Glu Ala Tyr Gly 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

His Glu Ala Tyr Glu 
1 5 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..4

(D) OTHER INFORMATION: /label= Xaa
/note= "Xaa = Leu or Ile"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Xaa Leu Glu Thr 
1 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal



(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..7 

(D) OTHER INFORMATION: /label= Xaa
/note= "Xaa = Leu or Ile"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Xaa Leu Glu Thr Tyr Lys Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= Xaa1-4

/note= "Xaa1 = Ile or Leu or Ala; Xaa2-4, each =
any amino acid"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Xaa Xaa Xaa Xaa Tyr Lys Ala
1 5 
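
The degenerate motifs of SEQ ID NO: 15 (Ala Xaa Tyr Gly, Xaa = Val or Thr) and SEQ ID NO: 20 (Xaa1 = Ile, Leu or Ala, followed by any three residues, then Tyr Lys Ala) can be written as regular expressions over one-letter codes. The sketch below is illustrative only: the function name `find_motifs` and the `first_position` parameter are assumptions, and the test fragment is residues 33-80 of SEQ ID NO: 12 converted to one-letter codes.

```python
import re

# Regular-expression forms of the two degenerate motifs.
MOTIFS = {
    "SEQ ID NO: 15": re.compile(r"A[VT]YG"),        # Ala (Val|Thr) Tyr Gly
    "SEQ ID NO: 20": re.compile(r"[ILA].{3}YKA"),   # (Ile|Leu|Ala) Xaa3 Tyr Lys Ala
}


def find_motifs(protein: str, first_position: int = 1):
    """Return (motif name, 1-based start position, matched text) tuples.

    `first_position` is the residue number of protein[0], so that hits in a
    fragment are reported in the numbering of the full-length sequence.
    """
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(protein):
            hits.append((name, match.start() + first_position, match.group()))
    return hits


# Residues 33-80 of SEQ ID NO: 12, in one-letter code.
FRAGMENT = (
    "NAVYGLTKMLIKFLKE"   # residues 33-48
    "HISIGKDACVFVLDSK"   # residues 49-64
    "GGSKKRKDILETYKAN"   # residues 65-80
)

for hit in find_motifs(FRAGMENT, first_position=33):
    print(hit)
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for short motifs such as these.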
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(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer MK61 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
AGGACTACAA CTGCCACACA CC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer RAGl 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
CGAGGCGCGC CAGCCCCAGG AGATCTACCA GCTCCTTG 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer DG29 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
AGCTTATGTC TCCAAAAGCT

20

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer DG30 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AGCTTTTGGA GACATA 

16 
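
Comparing SEQ ID NO: 23 (DG29) with SEQ ID NO: 24 (DG30) suggests that DG30 is the reverse complement of the last 16 bases of DG29. The listing itself does not state this relationship, so the short Python sketch below is offered only as a way to check it; `reverse_complement` is an illustrative helper name.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (spaces are ignored)."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


DG29 = "AGCTTATGTC TCCAAAAGCT".replace(" ", "")   # SEQ ID NO: 23, 20 nucleotides
DG30 = "AGCTTTTGGA GACATA".replace(" ", "")       # SEQ ID NO: 24, 16 nucleotides

# True if DG30 can pair with the 3'-terminal 16 bases of DG29.
print(DG29.endswith(reverse_complement(DG30)))
```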



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer PLIO 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GGCGTACCTT TGTCTCACGG GCAAC 

25 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA primer FL63

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26 
GATAAAGGCA TGCTTCAGCT TGTGAACG 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer FL69 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TGTACTTCTC TAGAAGCTGA ACAGCAG 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer FL64 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CTGAAGCATG TCTTTGTCAC CGGTTACTAT CAATAT

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer FL65 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29 
TAGTAACCGG TGACAAAG 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer FL66 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTATGCCATG GATAGATCGC TTTCTACTTC C 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer FL67 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CAAGCCCATG GAAACTTACA AGGCTCAAAG A 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer T2A292 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GTCGGCATAT GGCTCCTGCT CCTCTTGAGG AGGCCCCCTG GCCCCCGCC 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer TZROl 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GACGCAGATC TCAGCCCTTG GCGGAAAGCC AGTCCTC 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA primer TSA288

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GTCGGCATAT GGCTCCTAAA GAAGCTGAGG AGGCCCCCTG GCCCCCGCC 49

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer TSROl 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GACGCAGATC TCAGGCCTTG GCGGAAAGCC AGTCCTC 37 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer DG122 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
CCTCTAAACG GCAGATCTGA TATCAACCCT TGGCGGAAAG C 41

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 nucleotides
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA primer TAFI285 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GTCGGCATAT GATTAAAGAA CTTAATTTAC AAGAAAAATT AGAAAAGG 48 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA primer TAFROl 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
CCTTTACCCC AGGATCCTCA TTCCCACTCT TTTCCATAAT AAACAT 



1. A recombinant thermostable DNA polymerase enzyme
which exhibits altered 5' to 3' exonuclease
activity from that of its native DNA polymerase.

2. The recombinant thermostable DNA polymerase enzyme
of claim 1 wherein a greater amount of 5' to 3'
exonuclease activity is exhibited than that of the
native DNA polymerase.



3. The recombinant thermostable DNA polymerase enzyme
of claim 2 comprising the amino acid sequence
A(X)YG wherein X is V or T (SEQ ID NO: 15), and/or
the amino acid sequence X1X3YKA wherein X1 is I, L
or A and X3 is any sequence of three amino acids
(SEQ ID NO: 20).

4. The recombinant thermostable DNA polymerase enzyme
of claim 1 wherein a lesser amount of 5' to 3'
exonuclease activity is exhibited than that of the
native DNA polymerase.

5. The recombinant thermostable DNA polymerase enzyme
of claim 4 which in its native form comprises the
amino acid sequence A(X)YG wherein X is V or T (SEQ
ID NO: 15), said amino acid sequence being mutated
or deleted in said recombinant enzyme.

6. The recombinant thermostable DNA polymerase enzyme
of claim 5 wherein G of SEQ ID NO: 15 is mutated.

7. The recombinant thermostable DNA polymerase enzyme
of claim 6 wherein G of SEQ ID NO: 15 is mutated to
A.



8. The recombinant thermostable DNA polymerase enzyme
of claim 4 which in its native form comprises the
amino acid sequence HEAYG (SEQ ID NO: 16), said
amino acid sequence being mutated or deleted in
said recombinant enzyme.

9. The recombinant thermostable DNA polymerase enzyme
of claim 4 which in its native form comprises the
amino acid sequence HEAYE (SEQ ID NO: 17), said
amino acid sequence being mutated or deleted in
said recombinant enzyme.

10. The recombinant thermostable DNA polymerase enzyme
of claim 4 which in its native form comprises the
amino acid sequence XLET wherein X is L or I (SEQ
ID NO: 18), said amino acid sequence being mutated
or deleted in said recombinant enzyme.

11. The recombinant thermostable DNA polymerase enzyme
of claim 4 selected from the group consisting of
mutant forms of Thermus species sps17, Thermus
species Z05, Thermus aquaticus, Thermus
thermophilus, Thermosipho africanus and Thermotoga
maritima.

12. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus aquaticus comprising amino acids 77-832 of
SEQ ID NO: 2.

13. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus aquaticus comprising amino acids 47-832 of
SEQ ID NO: 2.

14. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus aquaticus comprising amino acids 155-832 of
SEQ ID NO: 2.

15. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus aquaticus comprising amino acids 203-832 of
SEQ ID NO: 2.

16. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus aquaticus comprising amino acids 290-832 of
SEQ ID NO: 2.

17. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermotoga maritima comprising amino acids 38-893
of SEQ ID NO: 4.

18. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermotoga maritima comprising amino acids 21-893
of SEQ ID NO: 4.

19. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermotoga maritima comprising amino acids 74-893
of SEQ ID NO: 4.

20. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermotoga maritima comprising amino acids 140-893
of SEQ ID NO: 4.

21. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermotoga maritima comprising amino acids 284-893
of SEQ ID NO: 4.

22. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species sps17 comprising amino acids 44-830
of SEQ ID NO: 6.

23. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species sps17 comprising amino acids 74-830
of SEQ ID NO: 6.

24. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species sps17 comprising amino acids
152-830 of SEQ ID NO: 6.

25. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species sps17 comprising amino acids
200-830 of SEQ ID NO: 6.

26. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species sps17 comprising amino acids
288-830 of SEQ ID NO: 6.

27. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species Z05 comprising amino acids 47-834
of SEQ ID NO: 8.

28. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species Z05 comprising amino acids 78-834
of SEQ ID NO: 8.

29. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species Z05 comprising amino acids 156-834
of SEQ ID NO: 8.

30. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species Z05 comprising amino acids 204-834
of SEQ ID NO: 8.

31. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus species Z05 comprising amino acids 292-834
of SEQ ID NO: 8.

32. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus thermophilus comprising amino acids 47-834
of SEQ ID NO: 10.

33. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus thermophilus comprising amino acids 78-834
of SEQ ID NO: 10.

34. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus thermophilus comprising amino acids 156-834
of SEQ ID NO: 10.

35. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus thermophilus comprising amino acids 204-834
of SEQ ID NO: 10.

36. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermus thermophilus comprising amino acids 292-834
of SEQ ID NO: 10.

37. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermosipho africanus comprising amino acids 38-892
of SEQ ID NO: 12.

38. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermosipho africanus comprising amino acids 94-892
of SEQ ID NO: 12.

39. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermosipho africanus comprising amino acids
140-892 of SEQ ID NO: 12.

40. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermosipho africanus comprising amino acids
204-892 of SEQ ID NO: 12.

41. The recombinant thermostable DNA polymerase enzyme
of claim 11 wherein said enzyme is a mutant form of
Thermosipho africanus comprising amino acids
285-892 of SEQ ID NO: 12.

42. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus aquaticus, said DNA
sequence comprising nucleotides 229-2499 of SEQ ID
NO: 1.

43. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus aquaticus, said DNA
sequence comprising nucleotides 139-2499 of SEQ ID
NO: 1.

44. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus aquaticus, said DNA
sequence comprising nucleotides 463-2499 of SEQ ID
NO: 1.

45. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus aquaticus, said DNA
sequence comprising nucleotides 607-2499 of SEQ ID
NO: 1.

46. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus aquaticus, said DNA
sequence comprising nucleotides 868-2499 of SEQ ID
NO: 1.

47. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermotoga maritima, said DNA
sequence comprising nucleotides 132-2682 of SEQ ID
NO: 3.

48. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermotoga maritima, said DNA
sequence comprising nucleotides 61-2682 of SEQ ID
NO: 3.

49. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermotoga maritima, said DNA
sequence comprising nucleotides 220-2682 of SEQ ID
NO: 3.

50. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermotoga maritima, said DNA
sequence comprising nucleotides 418-2682 of SEQ ID
NO: 3.

51. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermotoga maritima, said DNA
sequence comprising nucleotides 850-2682 of SEQ ID
NO: 3.

52. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species sps17, said DNA
sequence comprising nucleotides 130-2493 of SEQ ID
NO: 5.

53. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species sps17, said DNA
sequence comprising nucleotides 220-2493 of SEQ ID
NO: 5.

54. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species sps17, said DNA
sequence comprising nucleotides 454-2493 of SEQ ID
NO: 5.



55. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species sps17, said DNA
sequence comprising nucleotides 598-2493 of SEQ ID
NO: 5.

56. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species sps17, said DNA
sequence comprising nucleotides 862-2493 of SEQ ID
NO: 5.

57. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species Z05, said DNA
sequence comprising nucleotides 139-2505 of SEQ ID
NO: 7.

58. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species Z05, said DNA
sequence comprising nucleotides 232-2505 of SEQ ID
NO: 7.



59. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species Z05, said DNA
sequence comprising nucleotides 476-2505 of SEQ ID
NO: 7.

60. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species Z05, said DNA
sequence comprising nucleotides 610-2505 of SEQ ID
NO: 7.

61. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus species Z05, said DNA
sequence comprising nucleotides 874-2505 of SEQ ID
NO: 7.

62. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus thermophilus, said DNA
sequence comprising nucleotides 139-2505 of SEQ ID
NO: 9.

63. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus thermophilus, said DNA
sequence comprising nucleotides 232-2505 of SEQ ID
NO: 9.

64. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus thermophilus, said DNA
sequence comprising nucleotides 466-2505 of SEQ ID
NO: 9.

65. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus thermophilus, said DNA
sequence comprising nucleotides 610-2505 of SEQ ID
NO: 9.

66. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermus thermophilus, said DNA
sequence comprising nucleotides 874-2505 of SEQ ID
NO: 9.

67. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermosipho africanus, said DNA
sequence comprising nucleotides 112-2679 of SEQ ID
NO: 11.

68. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermosipho africanus, said DNA
sequence comprising nucleotides 280-2679 of SEQ ID
NO: 11.

69. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermosipho africanus, said DNA
sequence comprising nucleotides 418-2679 of SEQ ID
NO: 11.

70. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermosipho africanus, said DNA
sequence comprising nucleotides 610-2679 of SEQ ID
NO: 11.

71. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 11 wherein said enzyme
is a mutant form of Thermosipho africanus, said DNA
sequence comprising nucleotides 853-2679 of SEQ ID
NO: 11.

72. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of claim 3.

73. A DNA sequence which encodes a thermostable DNA
polymerase enzyme of any of claims 5 through 10.

74. A recombinant DNA vector comprising the DNA 
sequence of any of claims 42 through 73. 

75. A recombinant host cell transformed with the vector
of claim 74.